Abstract

We examine the Boltzmann/Gibbs/Shannon entropy S_BGS, the non-additive Havrda-Charvat/
Daroczy/Cressie-Read/Tsallis entropy S_q and the Kaniadakis κ-entropy S_κ from the viewpoint of
coarse-graining, symplectic capacities and convexity. We argue that the functional form of
such entropies can be ascribed to a discordance in phase-space coarse-graining between two
generally different approaches: the Euclidean/Riemannian metric one, which reflects independence
and picks cubes as the fundamental cells, and the symplectic/canonical one, which picks
spheres/ellipsoids for this role. Our discussion is motivated by, and confined to, the behaviour
of Hamiltonian systems of many degrees of freedom. We see that Dvoretzky’s theorem provides
asymptotic estimates for the minimal dimension beyond which these two approaches are close
to each other. We state and speculate about the role that dualities may play in this viewpoint.

1 Introduction


Entropy is one of the central concepts in Statistical Mechanics. Although initially introduced
in thermodynamics, it has clearly superseded its modest origins and is currently widely used
in numerous fields extending from dynamical systems and geometry to communication theory,
complexity and far beyond. Due to its significance, considerable effort has been invested for
almost 150 years, since its introduction by R. Clausius (ca. 1865), in understanding its meaning
and ways to calculate it for specific systems. Fundamental contributions and interpretations,
in various contexts, were made by L. Boltzmann, J.W. Gibbs, J. von Neumann, C. Shannon,
A. Kolmogorov, A. Renyi, I. Csiszar and E. Jaynes, as well as numerous other scientists and
engineers ever since. The functional forms studied by most of the above people resemble each
other very closely, so they tend to be treated together as one and the same entropy. For the
purposes of the present work alone we will pretend they are the same, although conceptually
they are clearly very distinct from each other, even in the specific context of Statistical
Mechanics. Following this abusive bundling, we will be referring in the sequel to the
Boltzmann/Gibbs/Shannon functional form

    S_{BGS} = -k_B \sum_{i \in I} p_i \log p_i    (1)

where k_B is a constant which is identified in Statistical Physics as Boltzmann’s constant. In
(1), i is an index taking values in a finite or countable set I, which is used to label
the probabilities of possible outcomes.
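
As a concrete illustration of (1), the following minimal sketch evaluates S_BGS for a finite set of outcomes. The function name `bgs_entropy`, the choice k_B = 1 as default and the convention 0 log 0 = 0 are assumptions made here for illustration and are not taken from the text.

```python
import math

def bgs_entropy(probs, k_B=1.0):
    """Evaluate S_BGS = -k_B * sum_i p_i log p_i for a finite list of probabilities.

    Terms with p_i = 0 are skipped, i.e. 0 log 0 is taken to be 0 (an assumption).
    """
    if any(p < 0.0 for p in probs):
        raise ValueError("probabilities must be non-negative")
    if not math.isclose(sum(probs), 1.0, abs_tol=1e-9):
        raise ValueError("probabilities must sum to 1")
    return -k_B * sum(p * math.log(p) for p in probs if p > 0.0)

uniform = [0.25] * 4                 # maximal ignorance about 4 outcomes
peaked = [0.97, 0.01, 0.01, 0.01]    # almost certain outcome

print("uniform:", bgs_entropy(uniform))   # equals log(4) for 4 equiprobable outcomes
print("peaked: ", bgs_entropy(peaked))
```

The uniform distribution gives log |I|, and any more sharply peaked distribution on the same set gives a smaller value, which is one way to read the "level of ignorance" that the functional encodes.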

As is well known, (1) has a subjective character in Classical Physics. The probabilities
p_i appearing in (1) depend not only on the actual system under study but also on the level of
ignorance about the system of an experimenter/observer. On many occasions, it is worthwhile
to consider systems with continuous rather than discrete sets of outcomes. In such cases,
we naively translate the definition (1) into the continuum, in which the p_i morph into a
probability density function ρ : 𝔐 → R_+. The details and the exact way of considering such a
continuum limit may be a highly non-trivial process, during which one may have to introduce a
metric or a homogeneous structure etc. in order to reach a well-defined, and unique, result.
Ignoring such subtle and important issues, one would naively get

    S_{BGS} = -k_B \int_{\mathfrak{M}} \rho(x) \log \rho(x) \, d\mu    (2)

where μ is an appropriate measure on the sample space 𝔐. In the case of Hamiltonian systems
of many degrees of freedom, which is our object of study, 𝔐 usually stands for the phase space
of the system, endowed with a Riemannian metric g, with μ being the unique Riemannian measure
associated to g.

There are several problems with (2), though. One is that it is not coordinate independent
(diffeomorphism invariant). This can be traced back to the fact that, very much like (1) from
which it was naively inferred, (2) does not depend on the graph of ρ, but just on its range of
values. A second objection, related to the above comments, is that naively taking the limit of
the discrete p_i to the continuum ρ gives a divergent expression, the infinite part of which has
to be judiciously subtracted before (2) can be properly used or interpreted.
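
The divergence mentioned above can be made visible numerically. The sketch below is an illustration only: the standard Gaussian on the real line, the cell widths and the function names are assumptions chosen for convenience. It discretises the density into cells of width delta, evaluates the discrete entropy of the cell probabilities, and shows that it grows like -log(delta) as the cells shrink, while the subtracted quantity stays close to the value of the continuum expression.

```python
import math

def gaussian_cell_probs(delta, cutoff=10.0):
    """Probabilities of a standard Gaussian falling in cells of width delta on [-cutoff, cutoff]."""
    def cdf(x):
        return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))
    n_cells = int(round(2.0 * cutoff / delta))
    edges = [-cutoff + i * delta for i in range(n_cells + 1)]
    return [cdf(b) - cdf(a) for a, b in zip(edges, edges[1:])]

def discrete_entropy(probs):
    """Discrete entropy -sum p log p with k_B = 1; empty cells are skipped."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)

# Value of -int rho log rho dx for the standard Gaussian.
continuum_value = 0.5 * math.log(2.0 * math.pi * math.e)

for delta in (0.5, 0.1, 0.02):
    s = discrete_entropy(gaussian_cell_probs(delta))
    # s itself grows without bound as delta -> 0; s + log(delta) does not.
    print(delta, s, s + math.log(delta), continuum_value)
```

The term -log(delta) plays the role of the infinite part that has to be subtracted, and it depends on the size of the cells used in the discretisation rather than on the density.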

Fortunately, using a relative entropy expression, a la Kullback-Leibler for instance, addresses
the first problem. So what one actually computes in classical Statistical Mechanics is not really
the “absolute” entropy but rather a form of relative entropy of a probability distribution with
respect to an underlying background measure. If such a reference measure is independent of the
details of the particular model at hand, then it results in an additive constant which eventually
becomes irrelevant, as almost all experimentally verifiable quantities involve entropy variations
rather than the “absolute” value of the entropy itself. One chooses as a reference measure the
probability density function resulting from the continuum limit of a discretization of the phase
space 𝔐, usually by cubes of side length √h.

One could question some of these statements, presenting as a counter-argument the case of
the Sackur-Tetrode equation, which gives an expression for the (absolute) entropy of a classical
monatomic ideal gas

    S_{BGS} = k_B N \left[ \log\left( \frac{V}{N} \left( \frac{4 \pi m U}{3 N h^2} \right)^{3/2} \right) + \frac{5}{2} \right]    (3)

where U is the internal energy of the gas, m the mass of each atom, V the volume that the
gas occupies and N the number of atoms. It seems that (3) contradicts the above statements
pertaining to the additive constant. However, the appearance in (3) of h, which is an arbitrarily
chosen regularisation parameter in classical Physics and only acquires physical significance
as Planck’s constant in Quantum Physics, reinforces, rather than contradicts, the above
statements. The implicit phase space discretisation by cubes of side √h appearing in expressions
such as (3) plays an important part in the present work. Its implicit, therefore often forgotten,
presence is also at the heart of a recent controversy, to be mentioned below, about the existence,
appropriate form and physical significance of the continuum limit of the non-additive entropy
S_q.
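
The role of h as a regularisation parameter can be checked directly on the standard form of the Sackur-Tetrode equation, S/(N k_B) = log[(V/N)(4πmU/(3Nh²))^{3/2}] + 5/2. The sketch below is an illustration under stated assumptions: the gas (argon), the two thermodynamic states and the helper names are chosen here and do not come from the text. It shows that rescaling h shifts the entropy per particle by a state-independent constant, so entropy differences do not depend on h.

```python
import math

K_B = 1.380649e-23             # Boltzmann constant, J/K
H_PLANCK = 6.62607015e-34      # Planck constant, J s
ATOMIC_MASS = 1.66053906660e-27  # atomic mass unit, kg

def sackur_tetrode_per_particle(u_per_n, v_per_n, mass, h):
    """S / (N k_B) for a classical monatomic ideal gas, given U/N and V/N."""
    return math.log(v_per_n * (4.0 * math.pi * mass * u_per_n / (3.0 * h * h)) ** 1.5) + 2.5

def ideal_gas_state(temperature, pressure):
    """(U/N, V/N) for a monatomic ideal gas: U/N = (3/2) k_B T and V/N = k_B T / P."""
    return 1.5 * K_B * temperature, K_B * temperature / pressure

mass_argon = 39.948 * ATOMIC_MASS            # illustrative choice of gas
state_a = ideal_gas_state(300.0, 101325.0)   # illustrative states
state_b = ideal_gas_state(600.0, 101325.0)

for h in (H_PLANCK, 2.0 * H_PLANCK):
    s_a = sackur_tetrode_per_particle(*state_a, mass_argon, h)
    s_b = sackur_tetrode_per_particle(*state_b, mass_argon, h)
    # The absolute values depend on h; the difference s_b - s_a does not.
    print(h, s_a, s_b, s_b - s_a)
```

Doubling h lowers both entropies per particle by 3 log 2, the additive constant associated with the size of the phase-space cells.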


As seen in the previous paragraph, in order to overcome the infinities that inevitably creep
in during the transition from (1) to (2), we can subtract a renormalisation constant.
Alternatively, we can turn to Quantum Physics and use the von Neumann operator functional

    S_{vN} = -\mathrm{tr}\,(\rho \log \rho)    (4)

where ρ stands for the density operator (matrix) on the (kinematic) Hilbert space of the
wave-functions and tr is the trace over a basis of such a Hilbert space. The von Neumann entropy
may still need some form of regularisation and renormalisation, especially in Quantum Field
Theory. Its fundamental “drawback” is that one has to know the underlying Quantum Theory,
or at least to have a field theory description of the effective degrees of freedom, before it can
be properly implemented. An underlying quantum theory may not be known, as in the case of a
quantum theory of gravity, for instance. We would not want to even start a discussion about
the technical obstacles of actually computing (3) for particular systems. However, becoming
aware of such well-known and long-ago resolved issues pertaining to S_BGS shows the subtlety
and care with which the concept of entropy has to be treated.
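
For the smallest non-trivial case, (4) can be evaluated by hand. The sketch below treats a single qubit, which is an assumption made purely for illustration: a density matrix written as ρ = (1 + r·σ)/2 with Bloch vector r has eigenvalues (1 ± |r|)/2, and the trace in (4) reduces to the expression (1) applied to these eigenvalues. The function names are hypothetical.

```python
import math

def qubit_eigenvalues(rx, ry, rz):
    """Eigenvalues (1 + |r|)/2 and (1 - |r|)/2 of the qubit density matrix (1 + r.sigma)/2."""
    r = math.sqrt(rx * rx + ry * ry + rz * rz)
    if r > 1.0 + 1e-12:
        raise ValueError("Bloch vector must have length at most 1")
    r = min(r, 1.0)
    return 0.5 * (1.0 + r), 0.5 * (1.0 - r)

def von_neumann_entropy(eigenvalues):
    """-tr(rho log rho) computed from the spectrum of rho; zero eigenvalues are skipped."""
    return -sum(lam * math.log(lam) for lam in eigenvalues if lam > 0.0)

pure = qubit_eigenvalues(0.0, 0.0, 1.0)           # pure state, |r| = 1
mixed = qubit_eigenvalues(0.0, 0.0, 0.0)          # maximally mixed state, r = 0
partial = qubit_eigenvalues(0.3, 0.0, 0.4)        # |r| = 0.5

print(von_neumann_entropy(pure))      # vanishes for a pure state
print(von_neumann_entropy(mixed))     # equals log(2) for the maximally mixed state
print(von_neumann_entropy(partial))
```

No subtraction of a divergent constant is needed here, since the spectrum of ρ is discrete from the outset.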

An additional incentive for looking into fundamental issues pertaining to entropy comes
with the great success of S_BGS in quantifying the thermodynamic properties of systems at
equilibrium. Such a success is seen in the agreement of the predictions derived by using S_BGS
with numerous experimental observations made since its formulation. But what is the
source of such a spectacular success of S_BGS? The dictum that we use S_BGS “because it
works”, or because it is the “canon”, cannot possibly be satisfying if someone is interested in
getting a better understanding of why things work the way they do. It is not obvious, for
instance, why or whether S_BGS precisely describes the collective behaviour of systems with
long-range interactions, non-ergodic phase space evolution etc., not to even mention systems out
of equilibrium.

To understand the limitations of a particular functional, one can compare its properties,
predictions etc. with those of another judiciously chosen or intelligently constructed functional.
The Havrda-Charvat/Daroczy/Cressie-Read/Tsallis entropy S_q [5], [6], [7], [8], [9], [10] and the
(Kaniadakis) κ-entropy S_κ [11], [12], [13], [14], [15], both to be defined below, are used as
“alternative” (meaning in different regimes, or for different systems) functionals to S_BGS. In
addition, numerous other entropic functionals that have been recently introduced and used in
Statistical Mechanics [10], and more particularly the generalised exponential families, specific
members of which are S_q and S_κ and which were analysed in [16], [17], [18], can also be
considered as playing, in some part, such a role. A better understanding of such functionals, and
the determination of the essential physical features of the systems whose collective behaviour
they describe, will not only help appreciate their significance but also set some boundaries to
the complete dominance of S_BGS in Statistical Mechanics. Therefore, such an effort will also help
us understand S_BGS itself better.

One of the many still unanswered questions pertaining to S_q, S_κ and the numerous other
non-additive entropic functionals is their dynamical basis. If S_BGS successfully describes
systems having an ergodic evolution in their configuration or phase space, then what are the
underlying dynamical features of systems, if any, described by S_q, S_κ, etc.? The present work
is partly motivated by and echoes to some extent, the general viewpoint described in as well
as some of the fundamental issues pointed out in [1]. Even though [1] was written more than
a decade ago, and despite the intervening considerable activity in understanding aspects of S_q,
S_κ and other non-additive entropic functionals, it is probably fair to state that most of the
fundamental dynamical and statistical questions that [1] had pinpointed remain unclear to this
date.

In an attempt to address such questions, we explored in our relatively recent work [19], [20],
[21], [22], [23], [24], [25], [26], [27] some formal consequences of the definition of S_q. At no
point in these works, however, did we deal with the actual nature of S_q per se. We just
confined ourselves to formal algebraic and geometric structures and conclusions stemming from
its functional form. One of our key assumptions was that some of the algebraic properties of
S_q are not emergent from statistical averaging, but directly reflect dynamical properties
of the phase space of the system. In other words, we assumed that such algebraic properties of
S_q are “typical” of the underlying Hamiltonian system whose statistical behaviour is described
by S_q. A part of the present work is to investigate this assumed “typical” behaviour and to try
to determine how it may dictate at least some parts of the functional form of the entropy used
to describe such systems.

At the core of the present work is a question that appears trivial at first sight: if one knows
the microscopic evolution of a Hamiltonian system of many degrees of freedom, can one
predict, or pick among various “reasonable” entropic functionals, a unique one or, to be less
ambitious, a class of entropies that would successfully describe the macroscopic behaviour of
the system? The obvious answer appears to be negative, as statistics seems in the eyes of many
to be completely independent/dissociated from the underlying dynamics. It seems that we can
successfully do the former, as in the case of S_BGS, without knowing almost anything about the
latter. This is certainly the viewpoint advocated, among others, by J.W. Gibbs, L.D. Landau
and A.I. Khintchin, who consider the underlying dynamics to be largely irrelevant, inasmuch
as the ergodic hypothesis can be used to justify the choice of the micro-canonical ensemble. In
this viewpoint the success of Statistical Mechanics is ascribed to the large number of degrees
of freedom of such systems [29], [32]. The quantification comes from the Central Limit Theorem,
which “justifies” the “ubiquity” of Gaussians in physical, and not only physical, processes.
However, one may wish to note that the Central Limit Theorem does not hold unless the random
variables are independent or weakly correlated. When there are non-trivial (“strong”)
correlations among such variables, the question naturally arises as to which statistics, hence
which entropic functional, is appropriate for describing systems having such properties. This
also emphasises the question about the meaning of “independence” and how it needs to be
modified, if at all, for the cases of these “different kinds” of statistics.

By contrast, we follow the view of L. Boltzmann (in part), P. Ehrenfest and A. Einstein [33],
according to which the underlying dynamics is at the core of the thermodynamic behaviour
of a system. We believe that the recent emergence of numerous entropic functionals and the
explorations into the realm of non-ergodic evolutions in phase space make the underlying
dynamical explorations highly desirable and potentially enlightening. We view the emergence of
an entropic functional form for a Hamiltonian system with many degrees of freedom as a
manifestation of a dissonance in phase space: usually one coarse-grains [28], [29], [30], [31]
phase space by using cubes of side length √h. Such cubes, however, do not behave well under
canonical transformations. Based on the symplectic non-squeezing theorem and the subsequent
formulation of symplectic capacities as fundamental constructions in symplectic geometry, it is
probably more prudent to coarse-grain 𝔐 in terms of ellipsoids. This is because all symplectic
capacities have the same value on ellipsoids. Generalizing cubes into convex polyhedra, to take
into account the composition of the newer, non-additive, entropic forms, we can see such an
entropy as arising from the difference in coarse-graining between ellipsoids and convex polyhedra.
We choose the Banach-Mazur distance to quantify such a difference. Hence the problem reduces
to determining the Banach-Mazur distance between polyhedra and ellipsoids in 𝔐 of
typical side/radius length √h. Since ellipsoids are minimal from the viewpoint of dynamics/
symplectic capacities, but the polytopes do not have any a priori lower bound on their size,
we will consider the distance between such polytopes and the largest spheres/ellipsoids that
can be inscribed in them. A central result in the asymptotic limit of large n is provided by
Dvoretzky’s theorem, the lower bound in the dimension of which gives rise to the functional
form of S_BGS and provides the leading asymptotic form for non-additive entropies.
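
A rough feeling for how different cubes and inscribed spheres become in high dimension can be obtained from elementary geometry. The sketch below is a simple proxy and not a computation of the Banach-Mazur distance in general, which would require optimising over linear maps. For the cube [-1, 1]^n it reports the ratio of the circumscribed to the inscribed Euclidean radius, which is √n (a value known to coincide with the Banach-Mazur distance between the Euclidean ball and the cube), and the fraction of the cube's volume occupied by the inscribed ball. The function names are hypothetical.

```python
import math

def cube_radii(n):
    """Inradius and circumradius of the cube [-1, 1]^n in the Euclidean norm."""
    return 1.0, math.sqrt(n)

def inscribed_ball_volume_fraction(n):
    """Volume of the unit Euclidean ball divided by the volume 2^n of the cube [-1, 1]^n."""
    # Work with logarithms so that large n does not overflow the Gamma function.
    log_ball = 0.5 * n * math.log(math.pi) - math.lgamma(0.5 * n + 1.0)
    return math.exp(log_ball - n * math.log(2.0))

for n in (2, 3, 10, 100):
    r_in, r_out = cube_radii(n)
    print(n, r_out / r_in, inscribed_ball_volume_fraction(n))
```

Both indicators show that the mismatch between the two kinds of cells grows with the number of degrees of freedom, which is the regime that the asymptotic statements above refer to.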

In Section 2, we present some of the properties of S_q and S_κ that we need in this work.
In Section 3, we briefly discuss the geometry of “independence” and aspects of phase space
coarse-graining. In Section 4, we present background material about Hamiltonian systems and
symplectic geometry. In Section 5, we present basics of convex geometry/analysis needed to
follow our exposition. In Section 6, we discuss Dvoretzky’s theorem and dimension and the
role played by dualities in this viewpoint. Section 7 contains conclusions and some speculations.

2 Structures induced by two non-additive entropies.

Two of the most commonly used non-additive entropies, which have attracted substantial
attention recently, are presented in this Section. Properties pertinent to this work are also
stated.


2.1 The Tsallis entropy S_q and its induced operations.


The Havrda-Charvat [5], Daroczy [6], Cressie-Read [7], [8], Tsallis [9], [10] entropy S_q,
introduced and developed, in part, in the context of Statistical Mechanics by Tsallis, is given
for a discrete set of outcomes with probabilities {p_i}, parametrized by the index set I with
i ∈ I, by


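    S_q[\{p_i\}] = k_B \, \frac{1}{q-1} \left( 1 - \sum_{i \in I} p_i^q \right)    (5)

where k_B is the Boltzmann constant. Its continuum analogue for a sample space 𝔐 equipped
with a measure absolutely continuous with respect to the Lebesgue measure, with Radon-Nikodym
density ρ, is naively assumed to be [9], [10]

    S_q[\rho] = k_B \, \frac{1}{q-1} \left( 1 - \int_{\mathfrak{M}} [\rho(x)]^q \, dvol_{\mathfrak{M}} \right)    (6)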

Here dvol_𝔐 represents the infinitesimal volume element of 𝔐 when it is a Riemannian manifold,
as is usually the case for the Hamiltonian systems of many degrees of freedom which
are the focus of our attention. It should be noted that most recently there has been a
controversy regarding the validity of this naive extension of S_q to continuous variables, without
either side being definitively convincing, in our opinion. The controversy brought about by [34]
is intimately related to the implicit normalization of any entropy functional, such as S_BGS for
instance, required to make its definition coordinate independent (diffeomorphism invariant). It
is usually provided by the density distribution arising from the continuum limit of the
discretization of 𝔐 in cubes of side length √h. For S_BGS, due to the presence of the logarithm,
it results in a constant that is additive and hence can be ignored, inasmuch as entropy
differences are the only relevant quantities in physical predictions. By contrast, for S_q
such a term is not additive, but rather multiplicative, and therefore cannot be omitted in (6)
when considering entropy differences. Subsequently the above authors have presented their views
on this and related matters that may make the use of (6) rather questionable. This matter is of
interest but not of central importance in the line of arguments and viewpoint of the present
work. We will therefore sidestep these issues in what follows and keep using (6), pretending that
this naive expression is indeed a valid generalisation to the continuous case.

The nonextensive parameter q can generically take any value in R. There has been
a recent proposal to extend its validity to q ∈ C, which is certainly worth looking into, as
is the associated interpretation of such an extension. To have desirable
properties, such as relative insensitivity to rare events, convexity, and decay in a polynomial
manner [10], and following our past work [19], [20], [21], [22], [23], [24], [25], [26], [27] as
well as the more recent [38], [39], we will assume that q ∈ [0,1] ⊂ R everywhere in the sequel.
We straightforwardly notice that for q → 1 one recovers S_BGS. We will set henceforth k_B = 1
for brevity.
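
The limit q → 1 can be checked numerically on the discrete form S_q = (1 - Σ_i p_i^q)/(q - 1) with k_B = 1. The following sketch is illustrative only; the sample distribution and the function names are assumptions introduced here.

```python
import math

def bgs_entropy(probs):
    """-sum p log p with k_B = 1; zero-probability outcomes are skipped."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)

def tsallis_entropy(probs, q):
    """(1 - sum_i p_i^q) / (q - 1) with k_B = 1; q = 1 is handled as the BGS limit."""
    if q == 1.0:
        return bgs_entropy(probs)
    return (1.0 - sum(p ** q for p in probs if p > 0.0)) / (q - 1.0)

probs = [0.5, 0.3, 0.15, 0.05]   # illustrative distribution

for q in (0.0, 0.5, 0.9, 0.99, 0.999999):
    print(q, tsallis_entropy(probs, q))
print("BGS:", bgs_entropy(probs))
```

At q = 0 the expression reduces to the number of outcomes with non-zero probability minus one, and as q approaches 1 the printed values approach the BGS value.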


Conventionally, two subsystems A, B ⊂ Ω are considered independent if their marginal
probability distribution functions are related by

    p_{A \cup B} = p_A \, p_B    (7)

Here A ∪ B indicates the system resulting from the interaction of A and B. For such subsystems
S_BGS is easily seen to be additive, namely

    S_{BGS}(A \cup B) = S_{BGS}(A) + S_{BGS}(B)    (8)

The entropy S_q, however, is not additive [10], at least not in the conventional sense, as it
satisfies

    S_q(A \cup B) = S_q(A) + S_q(B) + (1-q) \, S_q(A) \, S_q(B)    (9)

This lack of additivity is usually ascribed to the long-range spatial and temporal correlations
of the systems that the S_q entropy conjecturally describes [10]. For systems described by such
S_q, additivity is manifestly restored if the addition is redefined as

    x \oplus_q y = x + y + (1-q) \, x y    (10)

Then

    S_q(A \cup B) = S_q(A) \oplus_q S_q(B)    (11)
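
The composition rule (9), equivalently (11), can be verified numerically for a pair of subsystems whose joint probabilities factorise as in (7). The sketch below uses two small illustrative distributions; all names and numerical values are assumptions made for the example.

```python
def tsallis_entropy(probs, q):
    """(1 - sum_i p_i^q) / (q - 1) with k_B = 1, for q different from 1."""
    return (1.0 - sum(p ** q for p in probs if p > 0.0)) / (q - 1.0)

def q_sum(x, y, q):
    """Generalised addition x + y + (1 - q) x y of equation (10)."""
    return x + y + (1.0 - q) * x * y

p_a = [0.6, 0.4]           # illustrative subsystem A
p_b = [0.5, 0.3, 0.2]      # illustrative subsystem B
# Joint distribution of independent subsystems, as in (7).
p_joint = [a * b for a in p_a for b in p_b]

q = 0.7
lhs = tsallis_entropy(p_joint, q)
rhs = q_sum(tsallis_entropy(p_a, q), tsallis_entropy(p_b, q), q)
plain_sum = tsallis_entropy(p_a, q) + tsallis_entropy(p_b, q)
print(lhs, rhs, plain_sum)   # lhs agrees with rhs, not with the ordinary sum
```

The ordinary sum of the two entropies falls short of the joint value by the cross term (1 - q) S_q(A) S_q(B), which is what the redefined addition absorbs.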


It took some time before a generalized product, distributive with respect to the addition (10),
was discovered. Even though [35] and [19] gave different forms of such a product,
conjecturally equivalent, we will be using here the one introduced in [35] as more elegant and
easier to work with. The generalised product turned out to be

    x \otimes_q y = \frac{1}{1-q} \left\{ (2-q)^{\frac{\log[1+(1-q)x] \, \log[1+(1-q)y]}{[\log(2-q)]^2}} - 1 \right\}    (12)

The definition of the generalised product (12) appears to be somewhat arcane. However, the
motivation behind its construction becomes more transparent, in our opinion, if we see it as a
result of demanding the commutativity of the diagram [20]

    \begin{array}{ccc}
    \mathbb{R} \times \mathbb{R} & \xrightarrow{\ \cdot\ } & \mathbb{R} \\
    \downarrow {\scriptstyle \tau_q \times \tau_q} & & \downarrow {\scriptstyle \tau_q} \\
    \mathbb{R}_q \times \mathbb{R}_q & \xrightarrow{\ \otimes_q\ } & \mathbb{R}_q
    \end{array}    (13)


Putting together the two generalized binary operations (10) and (12), we set up [20], [21] a
one-parameter family of deformations of the set of reals, denoted by R_q. An explicit isomorphism
between these two sets, τ_q : R → R_q, is given by [20], [21]

    \tau_q(x) = \frac{(2-q)^x - 1}{1-q}, \qquad q \in [0,1)    (14)

In terms of this field isomorphism, the product (12) can be rewritten as

    \tau_q(x \cdot y) = x_q \otimes_q y_q = \tau_q\left( \tau_q^{-1}(x_q) \cdot \tau_q^{-1}(y_q) \right)    (15)

By the above construction, we have in effect reduced the differences between S_BGS and S_q
to the differences between R and R_q. This can be seen more formally through a comparison
between the axioms used to determine S_BGS and S_q [45].
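
The algebraic statements of this subsection lend themselves to a direct numerical check. The sketch below implements the generalised sum (10), the generalised product in the form ((2-q)^{log[1+(1-q)x] log[1+(1-q)y]/[log(2-q)]²} - 1)/(1-q), and the map τ_q(x) = ((2-q)^x - 1)/(1-q) of (14), and then verifies that τ_q carries ordinary sums and products to the deformed ones and that the deformed product distributes over the deformed sum. The function names, the value q = 0.5 and the sample points are illustrative assumptions.

```python
import math

def tau_q(x, q):
    """Map R -> R_q of equation (14), for q in [0, 1)."""
    return ((2.0 - q) ** x - 1.0) / (1.0 - q)

def tau_q_inv(x_q, q):
    """Inverse map R_q -> R; requires 1 + (1 - q) x_q > 0."""
    return math.log(1.0 + (1.0 - q) * x_q) / math.log(2.0 - q)

def q_sum(x, y, q):
    """Generalised addition (10)."""
    return x + y + (1.0 - q) * x * y

def q_prod(x, y, q):
    """Generalised product (12); requires 1 + (1 - q) x > 0 and 1 + (1 - q) y > 0."""
    lx = math.log(1.0 + (1.0 - q) * x)
    ly = math.log(1.0 + (1.0 - q) * y)
    return ((2.0 - q) ** (lx * ly / math.log(2.0 - q) ** 2) - 1.0) / (1.0 - q)

q = 0.5
x, y, z = 1.3, -0.7, 2.1
a, b, c = tau_q(x, q), tau_q(y, q), tau_q(z, q)

print(tau_q(x + y, q), q_sum(a, b, q))      # tau_q turns + into the deformed sum
print(tau_q(x * y, q), q_prod(a, b, q))     # tau_q turns * into the deformed product
print(q_prod(a, q_sum(b, c, q), q),
      q_sum(q_prod(a, b, q), q_prod(a, c, q), q))   # distributivity
```

Each printed pair agrees up to rounding, which is the content of the commutative diagram (13) and of (15) restricted to a few sample points.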


In closing this Section, we should point out that there are several distinct and non-equivalent
definitions of q-exponentials in the literature (see e.g. [50]), quite frequently associated with
quantum groups, which have nothing obvious to do with S_q. With an eye to the next subsection,
the same words of caution also apply to the several κ-distributions existing in the literature
(e.g. [51], [52] in space plasmas), which have nothing obvious to do with S_κ. Due to this lack
of uniformity in nomenclature, one should be careful about the exact functional forms that are
used on each occasion.


2.2 The κ-entropy and its induced operations.

Among the many entropic functionals that have been constructed over the years, the κ-entropy
S_κ has also attracted some attention since its introduction. Unlike S_q, whose
origin can be traced to the thermodynamic formalism, the origins and possible scope of S_κ are
far more concrete: they rely on attempts to understand the thermodynamic behaviour of the
free relativistic gas, a system whose thermodynamic behaviour has proved to be far harder to
describe than could be naively suspected. Since we do live in a relativistic world, where locally
the principle of Relativity describes many physical phenomena, determining how it dictates the
collective behaviour of systems of many degrees of freedom may be of considerable importance.







The κ-entropy was introduced as a functional that generates, through the variational
principle, the κ-exponential distribution that arose from arguments pertaining to non-linear
kinetics. Lorentz invariance is already built into the underlying dynamics in this formalism.
The κ-entropy was defined directly for a continuous probability distribution with density ρ on
the sample space 𝔐 as

    S_\kappa[\rho] = -\int_{\mathfrak{M}} \left[ c(\kappa) \, \rho^{1+\kappa} + c(-\kappa) \, \rho^{1-\kappa} \right] dvol_{\mathfrak{M}}, \qquad c(\kappa) = \frac{Z^{\kappa}}{2\kappa(1+\kappa)}    (16)

where Z is a normalisation constant and κ ∈ R. So far as the author knows, there is no standing
proposal to extend this to κ ∈ C, although we do not see any reason why this would not be
feasible if the need arose and an appropriate physical motivation and interpretation were
provided, as in the case of S_q. We expect worries/objections about this continuous functional
form similar to the ones that arose for S_q, which were alluded to above. The discrete analogue
of (16) for a set of outcomes I with corresponding probabilities {p_i}, i ∈ I, would naively
appear to be

    S_\kappa[\{p_i\}] = -\sum_{i \in I} \left[ c(\kappa) \, p_i^{1+\kappa} + c(-\kappa) \, p_i^{1-\kappa} \right]    (17)

with c(κ) as in (16). One should be quite careful, though, in providing such a naive discrete
generalization if one is interested in maintaining for (17) some form of Lorentz-invariance, such
as the one that gave rise to (16). It is well known, and probably obvious, that discrete
structures violate manifest Lorentz-invariance, something that has presented major technical
challenges to proponents of quantum gravity theories. The solutions, which could also be adopted
here, are either to forego completely any requirement for even remnants of Lorentz-invariance in
(17), or to use arguments relying on randomness that preserve such a structure, as was done, for
instance, in [53] for the case of causal sets.

One can immediately observe that

    \lim_{\kappa \to 0} S_\kappa = S_{BGS}    (18)

It is also immediately obvious that S_κ is not additive with respect to the usual addition, and
that to restore manifest additivity one will have to define the generalised sum as

    x \oplus_\kappa y = x \sqrt{1 + \kappa^2 y^2} + y \sqrt{1 + \kappa^2 x^2}    (19)

where |κ| < 1, mirroring (10). This can be re-written as

    x \oplus_\kappa y = \frac{1}{\kappa} \sinh\left( \mathrm{arcsinh}(\kappa x) + \mathrm{arcsinh}(\kappa y) \right)    (20)

to resemble more closely the generalized product, to be defined next. The generalised product,
which is distributive with respect to the generalised sum (19) and mirrors (12), turns
out to be

    x \otimes_\kappa y = \frac{1}{\kappa} \sinh\left( \frac{1}{\kappa} \, \mathrm{arcsinh}(\kappa x) \, \mathrm{arcsinh}(\kappa y) \right)    (21)

Then, in parallel to the case of S_q, one [13] can define the deformed field R_κ = (R, ⊕_κ, ⊗_κ)
and set up [13] a field isomorphism τ_κ : R → R_κ, which is given explicitly by

    \tau_\kappa(x) = \frac{1}{\kappa} \, \mathrm{arcsinh}(\kappa x)    (22)

with an inverse

    \tau_\kappa^{-1}(x) = \frac{1}{\kappa} \sinh(\kappa x)    (23)

In analogy with (14) for S_q, and also mimicking (15), we get

    \tau_\kappa(x \cdot y) = x_\kappa \otimes_\kappa y_\kappa = \tau_\kappa\left( \tau_\kappa^{-1}(x_\kappa) \cdot \tau_\kappa^{-1}(y_\kappa) \right)    (24)

We are not aware of an existing axiomatic formulation of the κ-entropy, but we do not consider
this a drawback at a physical level, but rather an open question at the formal level that
remains to be addressed, if such interest arises, in the future.
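
The deformed operations of this subsection can be checked numerically in the same spirit as for S_q. The sketch below implements the generalised sum in its two forms, x√(1+κ²y²) + y√(1+κ²x²) and (1/κ) sinh(arcsinh(κx) + arcsinh(κy)), together with the generalised product (1/κ) sinh((1/κ) arcsinh(κx) arcsinh(κy)). The names `to_deformed` and `from_deformed` are hypothetical labels for the pair sinh(κx)/κ and arcsinh(κx)/κ; they are chosen so that `to_deformed` is the direction that turns ordinary sums and products into the deformed ones, and the sketch does not settle which of the two expressions should carry the name τ_κ. The value κ = 0.4 and the sample points are illustrative.

```python
import math

def kappa_sum(x, y, kappa):
    """Generalised sum written with square roots."""
    return x * math.sqrt(1.0 + (kappa * y) ** 2) + y * math.sqrt(1.0 + (kappa * x) ** 2)

def kappa_sum_sinh(x, y, kappa):
    """The same generalised sum written with sinh and arcsinh."""
    return math.sinh(math.asinh(kappa * x) + math.asinh(kappa * y)) / kappa

def kappa_prod(x, y, kappa):
    """Generalised product written with sinh and arcsinh."""
    return math.sinh(math.asinh(kappa * x) * math.asinh(kappa * y) / kappa) / kappa

def to_deformed(x, kappa):
    """x -> sinh(kappa x) / kappa (hypothetical name, see the lead-in)."""
    return math.sinh(kappa * x) / kappa

def from_deformed(x, kappa):
    """x -> arcsinh(kappa x) / kappa, the inverse of to_deformed."""
    return math.asinh(kappa * x) / kappa

kappa = 0.4
x, y, z = 1.3, -0.7, 2.1
a, b, c = (to_deformed(t, kappa) for t in (x, y, z))

print(kappa_sum(a, b, kappa), kappa_sum_sinh(a, b, kappa))     # the two forms of the sum agree
print(to_deformed(x + y, kappa), kappa_sum(a, b, kappa))       # + becomes the deformed sum
print(to_deformed(x * y, kappa), kappa_prod(a, b, kappa))      # * becomes the deformed product
print(kappa_prod(a, kappa_sum(b, c, kappa), kappa),
      kappa_sum(kappa_prod(a, b, kappa), kappa_prod(a, c, kappa), kappa))   # distributivity
```

Each printed pair agrees up to rounding, so the deformed sum and product inherit the field structure of the reals through this pair of maps.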


2.3 Features of the S_q and S_κ entropies.

We observe from the above structures that, even though S_q and S_κ are, arguably, the two
most developed non-additive entropic functionals in Statistical Mechanics to date, they are not
really all that different from each other in terms of their induced structures. For instance,
one can easily argue, as in [21], [54], that S_κ also describes cases of “weak chaos”. To be more
precise [21], [54], one can straightforwardly see that if a system is described by the κ-entropy
and the composition (20) or (21) is a reflection of its phase space metric properties, rather
than being an emergent property due to statistical averaging, then the largest Lyapunov
exponent of the underlying dynamical system will be zero, in complete analogy with the case
of S_q. This commonality can be traced back to the similarity between the functional forms
of S_q and S_κ: both (6) and (16) can be seen to have a functional form that is asymptotically
exponential. These functional forms are actually suggestive of the different parametrizations
of the hyperbolic space [55]. Of course, this does not mean that the actual functional forms of
S_q and S_κ are the same, or that they will give rise to the same physical predictions, but they
should share, asymptotically, some common features such as describing weak chaos. It would be
of great interest to compare the features of the systems that are described by each one of these
two entropic functionals. We believe that one should be able to say similar things
for many, if not necessarily all, probability distributions belonging to the exponential family,
aspects of which have been developed in [16], [17], [18].

The non-uniqueness of S_q, at least from the viewpoint of its composition properties, and the
fact that it is part of a larger family of functional forms that share many common features, was
also briefly touched upon in [27]. It was noticed in [27] that even though S_q was an interesting
case of a functional form belonging to the displacement convexity class DC_N for

    N = \frac{1}{1-q}    (25)

it was not unique, by any means. Its uniqueness was restored when, in addition to (10), one
could invoke the other axioms used to determine S_q. It is not obvious, to us at least, that some
of these axioms, even if reasonable, should necessarily describe the properties of the entropy
for systems out of equilibrium, with long-range temporal and spatial correlations, etc. In
accordance with the functional forms of the generalized entropies used in defining the
Bakry-Emery-Ricci curvature through optimal transportation, as presented in [27], it may be more
prudent to consider S_q as just one interesting example of an entropy having a
polynomial/power-law form rather than as the unique entropy that may describe systems having the
properties mentioned above. Therefore the afore-mentioned interest in the analysis of systems
that are described by one of such entropies, but not by the others, may help clarify their range
of applicability or even the physical mechanisms leading to their effectiveness in describing the
macroscopic properties of such systems.



3 The “shape” of independence and phase-space coarse-graining. 

In this Section we analyse the concept of “independence” and the subsequent shape it induces
on the fundamental cells in phase-space coarse-graining, with a view toward S_BGS and the
non-additive entropies of the previous Section.


3.1 Independence and cubes. 

The conventionally accepted formalisation of the concept of independent interacting subsystems
was stated in (7) and is realised through the multiplicative character of the marginal probability
distributions of the interacting subsystems. In the closely related case of random variables X
and Y, they are called “independent” if

    E[X \cdot Y] = E[X] \cdot E[Y]    (26)

where E stands for the “expectation value” of the corresponding random variable [56]. At the
set-theoretical level, “independence” is conventionally encoded via the Cartesian product of
sets. From this viewpoint, the simplest set expressing set-theoretic independence is the unit
cube in R^n, indicated by

    I_n = [-1,1]^n    (27)
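
The link between the Cartesian product structure of the cube and the factorisation of expectation values can be seen in two dimensions. The sketch below compares the uniform distribution on the square [-1, 1]², which is a Cartesian product, with the uniform distribution on the unit disc, which is not. Since the first moments of X and Y vanish by symmetry for both shapes, the factorisation test is applied to X² and Y² instead. The midpoint-rule grid, its resolution and the function names are assumptions made for illustration.

```python
def second_moment_test(inside, n=400):
    """Midpoint-rule estimates of E[X^2 Y^2] and E[X^2] E[Y^2] for the uniform
    distribution on the subset of [-1, 1]^2 selected by the predicate `inside`."""
    step = 2.0 / n
    points = [-1.0 + (i + 0.5) * step for i in range(n)]
    count = 0
    sum_x2 = sum_y2 = sum_x2y2 = 0.0
    for x in points:
        for y in points:
            if inside(x, y):
                count += 1
                sum_x2 += x * x
                sum_y2 += y * y
                sum_x2y2 += x * x * y * y
    return sum_x2y2 / count, (sum_x2 / count) * (sum_y2 / count)

def in_square(x, y):
    return True

def in_disc(x, y):
    return x * x + y * y <= 1.0

print("square:", second_moment_test(in_square))   # the two numbers coincide
print("disc:  ", second_moment_test(in_disc))     # the two numbers differ
```

On the square the two estimates agree (both are close to 1/9), while on the disc they are close to 1/24 and 1/16 respectively, so the coordinates of a point drawn uniformly from the disc are not independent even though they are uncorrelated.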

From (7) it becomes obvious that the concept of probabilistic independence is intimately related
to multiplicative-like structures [56]. Hence modifications in the definition of multiplication,
as in (12), (21) for instance, will have significant implications for determining what constitutes
“independent” outcomes. Through all this, we want to indicate that the introduction of (12),
which was induced by S_q, and of (21) for S_κ, forces us to re-think and modify the concept
of “independence” in the framework of the non-additive entropies (6), (16). This modification
of “independence” is necessary due to the long-range temporal and spatial correlations
of the systems that the non-additive entropies describe. When such correlations are present,
the conventional definition (7) does not behave well (“covariantly”) with respect to the
structures induced by the underlying entropies such as S_q or S_κ. As stated above, if we want to
assume that the macroscopic algebraic and geometric properties are a direct reflection of the
microscopic dynamics, and not emergent due to statistics, then a more “covariant” definition
of independence at the microscopic level would be

    p_{A \cup B} = p_A \otimes_q p_B    (28)

    p_{A \cup B} = p_A \otimes_\kappa p_B    (29)

following (12) or (21) respectively. Since there is no obvious generalisation of the Cartesian
product in such cases, it is hard to see how one can find the counterparts of the unit cube (27)
for the generalised products (12), (21).


3.2 Generalized independence and polytopes. 

The question that therefore naturally arises is how to determine such generalised “cubes” whose 
shape would express generalized independence the same way that (27) expresses conventional 
independence. An answer is provided if one thinks of the cube in a metric, rather than in 
a set-theoretic, way. Consider $\mathbb{R}^n$. Its elements $a$ are ordered n-tuples of real numbers 
$a = (a_1, \ldots, a_n)$. Their $p$-norm, for real $p \geq 1$ so that the triangle inequality is satisfied, is defined 
by 

$\|a\|_p = \left( \sum_{i=1}^{n} |a_i|^p \right)^{1/p}$ (30) 
where $|\cdot|$ stands for the absolute value of its argument. The sup-norm in $\mathbb{R}^n$ can be seen either 
as 

$\|a\|_\infty = \sup_i \{|a_i|\}$ (31) 

or, equivalently, as the limit 

$\|a\|_\infty = \lim_{p \to \infty} \|a\|_p$ (32) 


With such norms, $\mathbb{R}^n$ is a Banach space, indicated as $l_p^n$ or $l_\infty^n$ respectively. The ball 
$B_r(x)$ of radius $r$ centered at a point $x$ of a metric space $X$ with distance function $d$ is 
defined by 

$B_r(x) = \{ y \in X : d(x,y) \leq r \}$ (33) 

and the sphere $S_r(x)$ of radius $r$ is defined by 

$S_r(x) = \{ y \in X : d(x,y) = r \}$ (34) 

One can easily see that the cube is the unit ball of $l_\infty^n$, namely 

$I^n = B_1^{\infty}(0)$ (35) 
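The relations (30)-(35) can be checked numerically. The following is a minimal sketch, assuming NumPy is available; the helper names `p_norm` and `in_unit_cube` are ours and do not appear in the original text. It illustrates that the p-norms of a fixed vector decrease towards its sup-norm as p grows, and that membership in the cube (27) coincides with having sup-norm at most one.

```python
import numpy as np

def p_norm(a, p):
    """p-norm (30) of a vector a; p = np.inf gives the sup-norm (31)."""
    a = np.abs(np.asarray(a, dtype=float))
    if np.isinf(p):
        return float(a.max())
    return float(np.sum(a ** p) ** (1.0 / p))

def in_unit_cube(a):
    """Coordinate-wise membership test for the cube [-1, 1]^n of (27)."""
    return bool(np.all(np.abs(np.asarray(a, dtype=float)) <= 1.0))

# Illustrative vector (an arbitrary choice, not taken from the text).
a = [0.3, -1.0, 0.5]
norms = [p_norm(a, p) for p in (1, 2, 8, 64)]  # non-increasing in p
sup = p_norm(a, np.inf)                        # limit (32)
```

The cube test never refers to a norm, yet it agrees with `p_norm(a, np.inf) <= 1`, which is the content of (35).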


where the superscript explicitly denotes the sup-norm. The advantage of this viewpoint is 
that it can be carried over directly to infinite dimensions, namely to the space of sequences 
$(a_1, \ldots, a_n, \ldots)$ with elements in $\mathbb{R}$, that is, to the Banach space $l_p$, $p \in [1, \infty]$. Actually there 
are several such reasonable infinite dimensional limits, depending on one’s goals, but we will 
not enter into the details of this. Such infinite dimensional limits are useful if one wishes to be able 
to consider the “thermodynamic limit” $n \to \infty$ at some stage of these calculations. Moreover, 
such definitions can be generalised to uncountable spaces, such as the Lebesgue spaces $L_p(\mathbb{R}^n)$ 
of p-integrable functions, to Orlicz, Sobolev and even more general function spaces [57] that 
may be useful. 

One can then use the generalized operations of the deformed fields $\mathbb{R}_q$ and $\mathbb{R}_\kappa$, instead of those 
of the usual addition and multiplication, to define the generalised cubes $\mathfrak{I}_q^n$ and $\mathfrak{I}_\kappa^n$ in exactly 
the same way as was done for $\mathbb{R}^n$ in the previous paragraph. This is possible because of 
the presence of the field isomorphisms (14), (23) which, being distance non-decreasing maps, 
also preserve the order structure of $\mathbb{R}$. Hence the topologies induced by the generalized 
operations of $\mathbb{R}_q$, $\mathbb{R}_\kappa$ are homeomorphic, the ordering of the elements of these sets is maintained, 
therefore the supremum has an unambiguous meaning etc. Given such definitions, the polytopes 
$\mathfrak{I}_q^n$, $\mathfrak{I}_\kappa^n$ playing the role of the cubes for the generalized products (12), (21) respectively, can 
all be seen to be given by 

Z = (36) 
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and 


(37) 


3 “ = Uh) 

where the tilde ~ denotes the n-dimensional extension of its underlying isomorphism. 


3.3 Euclidean and dynamical aspects of coarse-graining with cubes. 

The definitions of such cubes are particularly important in the context of coarse-graining of 
the phase space [291 EH EHl EHl EDI ED. Coarse-graining was introduced by P. Ehrenfest and 
T. Ehrenfest in an attempt to explain the origin of macroscopic irreversibility, in the face of 
microscopic reversible dynamics. Many of these ideas can be traced back to L. Boltzmann. 
One way to implement coarse-graining is to divide the phase space of the microscopic dynamical 
system (Hamiltonian of many degrees of freedom, in the case of our interest) into cells and 
substitute the smooth probability density $\rho$ of phase space by a piece-wise constant one $\rho_{cg}$ in 
each of these cells. The size of each cell is assumed to be small but it should not approach zero. 
For Boltzmann’s ideas about the behaviour of gases and the Sackur-Tetrode equation (3) the 
side length of each cube is taken to be $\sqrt{h}$. Effectively what this approach to coarse-graining 
does is to combine elements of the microscopic evolution of the system with a periodic partial 
equilibration. The end result is to determine a macroscopic kinetic equation that does not 
retain any memory of the initial condition of the system but captures the evolution of these 
successive partial equilibrations [28, 58, 59]. 
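The replacement of a smooth density by a piece-wise constant one can be illustrated with a one-dimensional toy model. The sketch below is only an illustration under our own assumptions: the interval $[0,1)$ stands in for phase space, the oscillating density and the cell size are arbitrary choices, and the function names are hypothetical. It shows that cell averaging preserves normalisation and, by the concavity of $-x \ln x$, cannot lower the entropy functional $-\int \rho \ln \rho$.

```python
import numpy as np

def gibbs_entropy(rho, dx):
    """Riemann-sum approximation of -integral(rho ln rho); zero density contributes zero."""
    rho = np.asarray(rho, dtype=float)
    safe = np.where(rho > 0, rho, 1.0)
    return float(-np.sum(rho * np.log(safe)) * dx)

def coarse_grain(rho, points_per_cell):
    """Replace rho by its average over consecutive cells of equal size."""
    rho = np.asarray(rho, dtype=float)
    cells = rho.reshape(-1, points_per_cell)
    return np.repeat(cells.mean(axis=1), points_per_cell)

n_fine = 1024
dx = 1.0 / n_fine
x = (np.arange(n_fine) + 0.5) * dx
rho = 1.0 + 0.9 * np.sin(2 * np.pi * 8 * x)      # fine-grained toy density
rho /= np.sum(rho) * dx                           # normalise to unit mass
rho_cg = coarse_grain(rho, points_per_cell=64)    # 16 cells of width 1/16

s_fine = gibbs_entropy(rho, dx)
s_cg = gibbs_entropy(rho_cg, dx)                  # not smaller than s_fine
```

The cell size is kept finite on purpose: letting it shrink to the grid spacing would return the fine-grained density and erase the entropy difference.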

The coarse-graining of the phase space, but in a different form than that described in the 
previous paragraph, can be attributed to the approximate knowledge that we have about the 
system, even at the (quasi-)classical level [60, 61]. In a physical situation there is always some 
uncertainty, either about the dynamics or about the exact initial conditions of the system, 
or about both. Such uncertainties are frequently encoded in dynamics as “noise” or some 
other stochastic process through which the system interacts with its environment. “Noise”, or 
particular slowly varying background fields, can also be seen as encoding the collective effect 
of degrees of freedom in the system which, although present, may be considered of secondary 
importance at the energy, time etc. scale of interest. This is the spirit behind the Langevin 
approach in constructing kinetic equations [62]. 

In addition, since it is impossible to prepare a system with absolute accuracy at some pre-determined 
state, we are inevitably led to consider not only a desired, or convenient, initial 
condition in our models, but a set of initial conditions that are reasonably close to the desirable 
one for the level of accuracy that we can tolerate in our predictions. Hence, one has to consider 
the evolution of sets of initial conditions under the given dynamics, with or without stochastic 
sources. This uncertainty is expressed by performing a periodic “$\varepsilon$-fattening” of the phase space 
evolution (orbit) of the system. After some judiciously chosen, for a particular model, amount 
of time, one “fattens” the Hamiltonian orbit, so that it initially appears to be like a tube. 
The question that arises is how such perturbations, assumed to be initially small, either in the 
initial conditions or through noise, affect the system under study. The initial hope that they 
would not affect the underlying system in any qualitatively significant way was proved to be 
too naive in [63] (see also [64]), in the case of systems with phase space dimension $\dim\mathfrak{M} > 2$. 
This instability directly questions the physical relevance, for Statistical Mechanics, of the single 
orbit analysis of any particular dynamical system, even if it were practically feasible. Hence one 
is forced to consider the behaviour of sets of orbits which are initially near each other. Then 
one uses the ergodic theorem to substitute averages over orbits with averages with respect to 
appropriate measures over the whole phase space. This is a reasonable choice, assumed to be 
true, as ergodic measures are precluded from having a “complicated” phase space behaviour 
such as possessing attracting sets etc. This is a direct implication of Birkhoff’s ergodic theorem 
[65]. Either way, and irrespective of the reason or the way that one chooses to perform phase 
space coarse-graining, during such a process some of the features of the underlying microscopic 
evolution are lost, a fact which is desirable if one wishes to capture the thermodynamic behaviour 
of the system with the half-dozen or so (at most) macroscopic variables, as is usually 
the case. 

The question that arises then is how to perform the coarse-graining of phase-space. The 
process appears, and largely is, ad hoc. But it is fundamental for the definition of any entropic 
functional. Due to its importance one may wish to make such a process a bit less ad hoc by 
employing even partial knowledge about the underlying dynamics of the system. The obvious 
choice is to assume that the phase-space is divided into cubical cells of side length $\sqrt{h}$, each of 
which obviously has a volume $h^{n/2}$ if $\dim\mathfrak{M} = n$. That typical cells in the coarse-graining 
process should be cubes is supported not only by the fact that geometrically they express 
“independence” or by their geometric simplicity, but also by the quantum nature of the 
underlying physical mechanisms. 

Probably the simplest realisation of this underlying quantum nature is the emergence of 
the unit cube of side $\sqrt{h}$ in the asymptotic expression for the spectrum of the Laplacian on $\mathfrak{M}$, 
which is provided by Weyl’s asymptotic formula [66]. Weyl’s asymptotic formula applies to a 
bounded domain $\Omega \subset \mathbb{R}^n$, but this is not a problem in our case, since the cubes that we use 
to coarse-grain $\mathfrak{M}$ have such a small side that they can be considered effectively flat, to a first order 
approximation. Assume that such a domain $\Omega$ also has a smooth boundary and indicate by 
$\lambda_k$, $k = 0, 1, \ldots, n, \ldots$ the eigenvalues of the Laplacian on functions $f : \Omega \to \mathbb{R}$ subject to the 
Dirichlet boundary condition $f|_{\partial\Omega} = 0$. Then the number $N(\lambda)$ of such eigenvalues which are 
smaller than $\lambda > 0$ behaves asymptotically as 


$N(\lambda) \sim \frac{\mathrm{vol}B_1 \cdot \mathrm{vol}\,\Omega}{(2\pi\hbar)^n} \, \lambda^{n/2}, \qquad \lambda \to \infty$ (38) 


where we have used cubes of side length $\sqrt{h}$. This counts the number of quantum states of the 
Laplacian inside $\Omega$, which can also be re-interpreted as the number of quantum cubical cells 
inside $\Omega$, for macroscopic values of $\lambda$ such as the ones needed in thermodynamics, hence $\lambda \to \infty$. 
Such validity relies tacitly on the fact that fundamental kinetic terms are always quadratic and 
that long memory effects that may give rise to non-Markovian evolutions described by anomalous 
kinetic terms are always an effective description arising due to the underlying statistics. 
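Weyl's counting estimate can be checked on a domain whose spectrum is known in closed form. The sketch below is an illustration under our own assumptions: it uses the unit square (which has corners rather than a smooth boundary, but for which the leading Weyl term is still valid), units with $\hbar = 1$, and hypothetical function names. The Dirichlet eigenvalues of the Laplacian on the unit square are $\pi^2(m^2 + k^2)$ with $m, k \geq 1$.

```python
import math
import numpy as np

def count_dirichlet_eigenvalues(lam):
    """Count eigenvalues pi^2 (m^2 + k^2), m, k >= 1, of the unit square below lam."""
    t = lam / math.pi ** 2
    m_max = int(math.sqrt(t)) + 1
    m = np.arange(1, m_max + 1)
    return int(np.count_nonzero(m[:, None] ** 2 + m[None, :] ** 2 < t))

def weyl_estimate(lam, n=2, volume=1.0):
    """Leading Weyl term vol(B_1) vol(Omega) lam^(n/2) / (2 pi)^n, assuming hbar = 1."""
    unit_ball = math.pi ** (n / 2) / math.gamma(n / 2 + 1)
    return unit_ball * volume * lam ** (n / 2) / (2 * math.pi) ** n

lam = 1.0e5
exact = count_dirichlet_eigenvalues(lam)
ratio = exact / weyl_estimate(lam)   # approaches 1 from below as lam grows
```

The ratio stays slightly below one at finite $\lambda$ because of a boundary correction of lower order in $\lambda$, which the leading term ignores.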


Despite the above plausibility arguments, the choice of the fundamental cells to be cubes 
still remains somewhat arbitrary. It should also be considered as still not acceptable, as it 
ignores a central aspect of the Hamiltonian dynamics on phase space: its canonical transformation 
invariance, or in other words the existence of a symplectic structure on $\mathfrak{M}$. As we will argue 
in the next Sections, choosing cubical cells for coarse-graining is probably the worst choice that 
someone could make in a metric sense, but probably the best in a measure-theoretical one: 
the best choice of a shape for the fundamental cells from the viewpoint of the Hamiltonian 
evolution would be (Euclidean) balls/ellipsoids instead of cubes. 


3.4 Riemannian aspects of phase space coarse-graining. 

An additional subtlety stems from the fact that the phase space on which the Hamiltonian 
evolution takes place is not usually $\mathbb{R}^n$ but some Riemannian manifold $\mathfrak{M}$, with additional 
structure which we chose to overlook in the previous Sections. Even though any Riemannian 
manifold can be $C^1$-differentiably embedded [67] (or even smoothly, i.e. $C^k$, $3 \leq k \leq \infty$, 
embedded [68]) into some $\mathbb{R}^N$, for $N$ large enough, an intrinsic description is sought after that 
would allow us not to worry about intrinsic vs the embedding features in the resulting geometric 
description. This is very much in the spirit of Geometry since the time of K.F. Gauss 
and was implemented in General Relativity, for instance. Riemannian manifolds are metrically 
almost Euclidean. Many of their metric properties can be expressed in terms of their sectional 
curvature, which determines locally (second order deviation from “flatness”) the distances on 
$\mathfrak{M}$ [69, 70, 71]. Among by-products of the sectional curvature, the Ricci curvature determines 
the volumes of shapes lying in hyperplanes perpendicular to a given direction on $\mathfrak{M}$, such as 
the direction of the Hamiltonian evolution. As a result of such curvatures, an initial shape will 
be distorted even if parallel transported along a curve. Hence if someone starts by partitioning 
the phase space into cubical cells, for the purposes of coarse graining, and wishes to follow the 
dynamics, the corresponding cells will become distorted cubes, i.e. 2n-face polytopes, along any 
orbit of the Hamiltonian system. As long as the underlying dynamics remains invertible, then 
the number of faces of such polytopes will remain 2n, even if the areas of the faces will no longer 
be equal to each other. If, for whatever reason (such as taking the thermodynamic limit), the 
dynamics loses its invertibility [65], then such cells may acquire a larger or smaller number of faces. 

To summarise the discussion of this Section, coarse-graining and the curvature features of 
the phase space $\mathfrak{M}$ force us to consider not only cubes but more general polytopes as the basic 
cells of coarse-graining of phase-space $\mathfrak{M}$. The need for such generalisation from cubes to 
polytopes becomes obvious, if one wishes to incorporate in the formalism the effects of generalized 
products such as (12), (21) via their induced generalised concepts of independence which 
are expressed geometrically through their induced “unit” cubes such as (36), (37). In all this 
discussion so far, we have (on purpose) ignored the dictates of the symplectic structure of $\mathfrak{M}$, 
which, as will be seen in the next Section, point toward a very different, and largely incompatible, 
proposal on how to actually perform such a coarse-graining. 


4 Symplectic basics: capacities and the role of ellipsoids. 

In this Section, we provide some background on aspects of Symplectic Geometry/Topology that 
we need for our arguments, in an attempt to make the manuscript reasonably self-contained. 
Even though the concepts and facts that we present are very well-known to a mathematical audience, 
some of them are very non-trivial and have either only been proved relatively recently 
or they are still a subject of investigation. One might wish to consult some books, such as 
[72, 73, 74, 75, 76, 77, 78] or reviews [79, 80, 81] to get a grasp of such matters that we can 
only very superficially touch upon here. 

It may, for a moment, be worth thinking about the role of the Hamiltonian approach to 
Mechanics. There are several well-known advantages over the Lagrangian formulation (and 
vice-versa). From our perspective, and for our purposes, the Hamiltonian approach is more 
suitable because it allows for more symmetries between its variables. By elevating the canonical 
coordinates $q^i$, $i = 1, \ldots, n$ and the canonical momenta (not probabilities!) $p_i$, $i = 1, \ldots, n$ 
to equal status, the number of independent variables is doubled, hence there is greater possibility 
to detect and profitably use otherwise hidden symmetries or invariances. To make this 
easier to understand, we can start from a simple discrete case: suppose we have been given one 
point. Then there is very little in terms of operations and symmetries that one can detect, 
therefore very little latitude and substantial lack of direction in building, detecting or utilising 
such structures. Now consider a set whose elements are multiple copies of this point. Then one 
can easily start by determining its automorphism group and its algebraic properties, one can 
build discrete geometric structures such as graphs or simplices etc. and then by some form of 
reduction one can go back to the induced properties of such structures pertinent to one point. 
This approach seems to have been appreciated first by E. Galois. The spirit of the Hamiltonian 
approach follows, to an extent, these lines. 


4.1 Basics about symplectic vector spaces. 

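Let $\mathcal{H}$ denote the Hamiltonian of a system (of many degrees of freedom, eventually). Hamilton’s 
equations, as is well-known, are 

$\dot{q}^i = \frac{\partial \mathcal{H}}{\partial p_i}, \qquad \dot{p}_i = -\frac{\partial \mathcal{H}}{\partial q^i}$ (39) 

where the dot indicates differentiation with respect to the evolution parameter (“time”). Since 
the canonical coordinates and momenta are on equal footing in the Hamiltonian approach, we 
can put them side by side as coordinates of a vector 

$\xi = (q^1, \ldots, q^n, p_1, \ldots, p_n)$ (40) 

and re-express Hamilton’s equations as 

$\dot{\xi}^k = \omega_{kj} \frac{\partial \mathcal{H}}{\partial \xi^j}, \qquad k = 1, \ldots, 2n$ (41) 

where the summation convention over repeated indices is assumed, and the matrix $\omega$ has elements 
$\omega_{ij}$, $i, j = 1, \ldots, 2n$ given by 

$\omega_{ij} = \begin{pmatrix} 0_n & 1_n \\ -1_n & 0_n \end{pmatrix}$ (42) 

where $0_n$ and $1_n$ stand for the null and the unit $n \times n$ matrices with real entries. Moreover, we 
see that 

$\omega^T = -\omega, \qquad \omega^{-1} = -\omega, \qquad \omega^2 = -1_{2n \times 2n}$ (43) 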
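The algebraic identities (43) and the compact form (41) of Hamilton's equations are easy to verify numerically. The sketch below assumes NumPy and the block convention $\omega = \begin{pmatrix} 0 & 1 \\ -1 & 0 \end{pmatrix}$ with $\xi = (q, p)$; the isotropic harmonic oscillator $\mathcal{H} = (|q|^2 + |p|^2)/2$ is our own choice of example, and the function names are hypothetical.

```python
import numpy as np

def omega(n):
    """Standard 2n x 2n symplectic matrix in the block form of (42)."""
    I, Z = np.eye(n), np.zeros((n, n))
    return np.block([[Z, I], [-I, Z]])

def grad_H(xi):
    """Gradient of the example Hamiltonian H = |xi|^2 / 2 (harmonic oscillator)."""
    return np.asarray(xi, dtype=float)

def hamiltonian_vector_field(xi):
    """Right-hand side of (41): xi_dot = omega . grad H."""
    n = len(xi) // 2
    return omega(n) @ grad_H(xi)

n = 3
w = omega(n)
antisymmetric = np.allclose(w.T, -w)
inverse_is_minus = np.allclose(np.linalg.inv(w), -w)
squares_to_minus_one = np.allclose(w @ w, -np.eye(2 * n))

xi = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])   # (q1, q2, q3, p1, p2, p3)
xi_dot = hamiltonian_vector_field(xi)            # (p, -q) for this Hamiltonian
```

For this Hamiltonian the output reproduces (39) component by component: $\dot{q} = p$ and $\dot{p} = -q$.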


These statements are abstracted in the definition of a real (finite dimensional) symplectic vector 
space $\mathfrak{V}$, which is a finite dimensional real vector space, of even dimension, endowed with an 
antisymmetric and non-degenerate bilinear form $\omega$, namely 

$\omega(X,Y) = -\omega(Y,X), \qquad X, Y \in \mathfrak{V}$ (44) 

and such that for any $X \neq 0 \in \mathfrak{V}$ there is $Y \in \mathfrak{V}$ such that 

$\omega(X,Y) \neq 0$ (45) 

The last equation is a non-degeneracy condition providing an isomorphism between $\mathfrak{V}$ and its 
dual $\mathfrak{V}^*$ by 

$X \mapsto i_X\omega = \omega(X, \cdot)$ (46) 

where $i_X$ denotes contraction of the symplectic form in the direction of the vector $X$. A different 
way to express the non-degeneracy of $\omega$ is by requiring that 

$\frac{\omega^n}{n!}$ (47) 

where the $n!$ just fixes a normalisation, be a volume form on $\mathfrak{V}$, which is unique (up to the 
normalisation). Following the standard algebraic practice, one defines $\mathfrak{W}$ to be a symplectic 
subspace of $\mathfrak{V}$ if $\omega$ is non-degenerate on the linear subspace $\mathfrak{W}$. Obviously, the antisymmetry 
condition of $\omega$ is satisfied on $\mathfrak{W}$. 


Even at this level, one can see that symplectic vector spaces are substantially different from 
spaces endowed with symmetric bilinear forms. Requiring antisymmetry, instead of symmetry, 
of a bilinear form on such a space has proved to have profound consequences, some of which 
will be noted below. One such consequence is that the concepts of symplectic and Euclidean 
orthogonality are very different: Let $\mathfrak{U}$ be a linear subspace of the symplectic vector space $\mathfrak{V}$. 
Then the (symplectic) orthogonal complement $\mathfrak{U}^\perp$ of $\mathfrak{U}$ is defined by 

$\mathfrak{U}^\perp = \{ X \in \mathfrak{V} : \omega(X, Y) = 0, \ \forall\, Y \in \mathfrak{U} \}$ (48) 

Unlike the case of Euclidean geometry, $\mathfrak{U}$ and $\mathfrak{U}^\perp$ need not be complementary subspaces, even 
though the non-degeneracy condition (45) implies that 

$\dim \mathfrak{U} + \dim \mathfrak{U}^\perp = \dim \mathfrak{V}$ (49) 

On the one hand, if they are indeed complementary, namely if 

$\mathfrak{U} \oplus \mathfrak{U}^\perp = \mathfrak{V}$ (50) 

then one can prove this is equivalent to stating that $\mathfrak{U}$ is a symplectic subspace of $\mathfrak{V}$, which 
is also equivalent to stating that 

$\mathfrak{U} \cap \mathfrak{U}^\perp = \{0\}$ (51) 

On the other hand, one can observe that every vector $X \in \mathfrak{V}$ is orthogonal to itself due to 
the antisymmetry of the symplectic form (42). These relationships between $\mathfrak{U}$ and $\mathfrak{U}^\perp$ that are 
absent in Euclidean geometry can be generalized: $\mathfrak{U}$ is called isotropic if $\mathfrak{U} \subset \mathfrak{U}^\perp$, co-isotropic 
if $\mathfrak{U}^\perp \subset \mathfrak{U}$ and Lagrangian if $\mathfrak{U}$ is both isotropic and co-isotropic, namely $\mathfrak{U} = \mathfrak{U}^\perp$. Clearly, 
in 2 dimensions, the lines passing through the origin are Lagrangian subspaces of the plane. The 
subspace of canonical coordinates and that of canonical momenta are Lagrangian subspaces at 
a point of the phase-space of a Hamiltonian system. 
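The definitions of the symplectic complement and of isotropic, co-isotropic, Lagrangian and symplectic subspaces can be made concrete in $\mathbb{R}^4$. The following is a minimal sketch under our own assumptions: NumPy is available, coordinates are ordered as $(q^1, q^2, p_1, p_2)$, $\omega(X,Y) = X^T \omega Y$ with $\omega$ in the block form (42), subspaces are given by linearly independent columns, and all function names are hypothetical.

```python
import numpy as np

def omega(n):
    I, Z = np.eye(n), np.zeros((n, n))
    return np.block([[Z, I], [-I, Z]])

def symplectic_complement(U, W, tol=1e-10):
    """Basis (as columns) of {X : omega(Y, X) = 0 for all Y in span(U)}, cf. (48)."""
    A = U.T @ W                      # row j is the linear form omega(u_j, .)
    _, s, Vt = np.linalg.svd(A)
    rank = int(np.sum(s > tol))
    return Vt[rank:].T               # null space of A

def classify(U, W):
    """Label span(U); the columns of U are assumed linearly independent."""
    k = U.shape[1]
    gram = U.T @ W @ U               # omega restricted to span(U)
    comp = symplectic_complement(U, W)
    isotropic = np.allclose(gram, 0.0)
    coisotropic = np.allclose(comp.T @ W @ comp, 0.0)   # complement is isotropic
    if isotropic and coisotropic:
        return "Lagrangian"
    if isotropic:
        return "isotropic"
    if coisotropic:
        return "co-isotropic"
    if np.linalg.matrix_rank(gram) == k:
        return "symplectic"
    return "none of these"

W = omega(2)
E = np.eye(4)                        # columns: q1, q2, p1, p2
examples = {
    "q-plane": E[:, [0, 1]],
    "(q1, p1)-plane": E[:, [0, 2]],
    "q1-line": E[:, [0]],
    "span(q1, q2, p1)": E[:, [0, 1, 2]],
}
labels = {name: classify(U, W) for name, U in examples.items()}
dims_add_up = all(
    U.shape[1] + symplectic_complement(U, W).shape[1] == 4
    for U in examples.values()
)
```

The dimension count (49) holds in every case, although only the $(q^1, p_1)$-plane is complementary to its symplectic complement.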

All these definitions are “strange” by Euclidean standards, as the Euclidean metric has been 
explicitly defined to exclude such occurrences. However the vanishing of a bilinear form may be 
more familiar in the context of Special and General Relativity. If $\omega$ were a symmetric, rather 
than antisymmetric, bilinear form then the fact that 

$\omega(X,X) = 0$ (52) 

would define the light-like vectors $X \in \mathfrak{V}$. Since the usual 4-dimensional Minkowski “distance 
function” 

$ds^2 = -c^2 dt^2 + dx^2 + dy^2 + dz^2$ (53) 

where $c$ indicates the speed of light, can be formally seen as arising from the 4-dimensional Euclidean 
metric through the formal substitution 

$t \mapsto -it$ (54) 

one may be led to suspect that there is an intimate relation between a Euclidean metric, an 
(almost) complex structure and the symplectic structure in a vector space $\mathfrak{V}$. The (almost) 
complex structure of $\mathfrak{V}$ can be defined as an anti-involution $J$, namely $J : \mathfrak{V} \to \mathfrak{V}$ such that 

$J^2 = -1$ (55) 


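In more concrete terms, and for $\mathfrak{V} = \mathbb{R}^{2n}$ with respect to a Cartesian base, $J$ has the antisymmetric 
form 

$J = \begin{pmatrix} 0_n & -1_n \\ 1_n & 0_n \end{pmatrix}$ (56) 

Then $\mathfrak{V}$ can be made into a complex vector space by defining, for $a, b \in \mathbb{R}$ and $X \in \mathfrak{V}$, 

$(a + ib)X = aX + bJX$ (57) 

where the action of $J$ on $X$ has the effect of a multiplication by $-i$. It is no coincidence that 
(42) and (56) have a similar form; indeed, if we indicate by $\langle \cdot, \cdot \rangle$ the Euclidean inner product 
on $\mathbb{R}^{2n}$, then for $X, Y \in \mathbb{R}^{2n}$ we see that 

$\omega(X,Y) = \langle JX, Y \rangle$ (58) 

or, since $J^2 = -1_{2n \times 2n}$, 

$\omega(X, JY) = \langle X, Y \rangle$ (59) 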
The compatibility conditions (58), (59) have profound consequences in the case of manifolds to 
which we will turn in the next paragraphs. 

Before that though, it may be worth mentioning another difference between the symplectic 
and the Euclidean cases: one can prove that in any symplectic vector space one can choose 
a “symplectic basis” where the symplectic form will have essentially the same form as in (42), 
or to be more precise, one can pick a basis $e_i, f_j$, $i,j = 1,\ldots,n$ so that 

$\omega(e_i, e_j) = 0, \qquad \omega(f_i, f_j) = 0, \qquad \omega(f_i, e_j) = \delta_{ij}, \qquad i,j = 1,\ldots,n$ (60) 

From our experience with Hamiltonian dynamics, one can see that this is an abstraction of the 
fact that, locally, the symplectic form looks like (42) in each 2-plane made up of the canonical 
coordinate $q^i$ and its conjugate canonical momentum $p_i$ for $i = 1, \ldots, n$. Hence (60) is the 
antisymmetric/symplectic analogue of the Gram-Schmidt diagonalization process of symmetric 
bilinear forms. We see that even though in the latter case there are numerous possibilities in 
this diagonalization process, in the symplectic case, all symplectic vector spaces are locally the 
same. This is behind Darboux’s theorem and the lack of local symplectic invariants for the 
case of symplectic manifolds, in sharp contrast to the Riemannian case. 

Let $\mathfrak{V}$ be a symplectic vector space endowed with the symplectic form $\omega$. A linear map 
$\varphi : \mathfrak{V} \to \mathfrak{V}$ is called symplectic or canonical if it preserves the symplectic structure, namely if 
the pull-back form $\varphi^*\omega$ obeys 

$(\varphi^*\omega)(X,Y) = \omega(X,Y)$ (61) 

As is clear from the terminology, symplectic maps are the linear canonical transformations of 
Hamiltonian Mechanics. These maps can be represented as matrices $\Phi$, and it turns out that 
they obey 

$\det \Phi = 1$ (62) 

whose geometric interpretation is that they are volume-preserving. This is the linear formulation 
of Liouville’s theorem. The obvious question on whether there is any difference between 
volume preserving and symplectic maps, in the case of manifolds, will be discussed in the sequel. 
To complete the discussion of symplectic vector spaces, it turns out that if $(\mathfrak{U}_1, \omega_1)$ and $(\mathfrak{U}_2, \omega_2)$ 
are two symplectic vector spaces of the same dimension, then there is a linear isomorphism 
$\varphi : \mathfrak{U}_1 \to \mathfrak{U}_2$ such that $\varphi^*\omega_2 = \omega_1$. Hence symplectic vector spaces of the same dimension are 
symplectically equivalent (indistinguishable from a symplectic viewpoint) as was also previously 
mentioned. 
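Conditions (61) and (62) can be tested on explicit matrices. In the sketch below, which assumes NumPy and the block form (42) for $\omega$, a linear symplectic map is assembled from three elementary block matrices that are each known to satisfy $\Phi^T \omega \Phi = \omega$; this construction and the function names are our own illustrative choices. The last matrix shows that unit determinant alone does not imply symplecticity once the dimension exceeds two.

```python
import numpy as np

def omega(n):
    I, Z = np.eye(n), np.zeros((n, n))
    return np.block([[Z, I], [-I, Z]])

def is_symplectic(M, tol=1e-9):
    """Matrix form of (61): M^T omega M = omega."""
    n = M.shape[0] // 2
    W = omega(n)
    return bool(np.allclose(M.T @ W @ M, W, atol=tol))

def random_linear_symplectic(n, rng):
    """Product of two shears with symmetric blocks and a block-diagonal scaling."""
    I, Z = np.eye(n), np.zeros((n, n))
    B = 0.5 * rng.normal(size=(n, n))
    B = B + B.T                                   # symmetric
    C = 0.5 * rng.normal(size=(n, n))
    C = C + C.T                                   # symmetric
    Q, _ = np.linalg.qr(rng.normal(size=(n, n)))
    A = Q @ np.diag(np.exp(0.5 * rng.normal(size=n)))   # invertible by construction
    upper = np.block([[I, B], [Z, I]])
    lower = np.block([[I, Z], [C, I]])
    scale = np.block([[A, Z], [Z, np.linalg.inv(A).T]])
    return upper @ scale @ lower

rng = np.random.default_rng(1)
M = random_linear_symplectic(3, rng)
det_M = float(np.linalg.det(M))                   # equals 1 up to round-off, cf. (62)

# Coordinates (q1, q2, p1, p2): stretch q1, shrink q2, leave the momenta alone.
V = np.diag([2.0, 0.5, 1.0, 1.0])
det_V = float(np.linalg.det(V))                   # also 1, yet V is not symplectic
```

The matrix `V` rescales two coordinates without compensating the conjugate momenta, so it preserves volume while changing $\omega$ on the $(q^1, p_1)$-plane.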
4.2 About symplectic manifolds. 

To define symplectic manifolds one follows the same steps as in the Riemannian case, 
but substitutes the symplectic for the corresponding Euclidean structures. Hence someone 
picks local patches (“charts”) of symplectic (instead of Euclidean) vector spaces and glues 
them together using symplectic (rather than regular) diffeomorphisms. The details of such an 
intuitively obvious, but non-trivial and cumbersome at times, construction can be found in 
the references. It is worth mentioning at this point one consequence of the fact that we use 
symplectic diffeomorphisms to glue together patches of symplectic vector spaces: the symplectic 
form $\omega$ is postulated to be closed on $\mathfrak{M}$ 

$d\omega = 0$ (63) 

where $d$ denotes exterior differentiation on the space of differential forms of $\mathfrak{M}$. One way to 
interpret the requirement (63) is to use the canonical symplectic base (60) translated to the 
case of $\mathfrak{M}$. A non-degenerate 2-form (hence anti-symmetric) $\omega$ is closed (63) if and only if at 
each point of $\mathfrak{M}$ there are coordinates $(q^1, \ldots, q^n, p_1, \ldots, p_n)$ such that 

$\omega = \sum_{i=1}^{n} dq^i \wedge dp_i$ (64) 

This theorem is due to G. Darboux. It expresses the fact that all symplectic manifolds are 
locally symplectically indistinguishable. Hence any non-trivial invariants of such manifolds will 
have to be global. Contrast this with the Riemannian case: in the Riemannian case there are 
plenty of local invariants which are encoded through the Riemann tensor at each point of $\mathfrak{M}$ and 
its multiple covariant (properly symmetrized) derivatives and their contractions. One can also 
see the lack of local structure of symplectic manifolds “equivariantly”: symplectic structures 
can be seen as the “quotient” of a topological space locally homeomorphic to $\mathbb{R}^n$ under a 
set of “symmetries” (actually re-parametrizations, therefore more akin to gauge rather than 
global symmetries), namely the action of the group of symplectic diffeomorphisms. If the set of such 
symmetries is large enough, it is entirely possible that the resulting structure is unique: this is 
what actually happens in the case of symplectic manifolds, locally at least. The non-degeneracy 
condition (45) carried over to the case of a symplectic manifold $\mathfrak{M}$ can be seen as expressing, 
via the isomorphism (46), an isomorphism between the vector fields and the one-forms of $\mathfrak{M}$, 
namely its tangent $T\mathfrak{M}$ and cotangent $T^*\mathfrak{M}$ bundles. Consider a vector field $X \in T\mathfrak{M}$ and the 
Lie derivative along it, indicated by $\mathcal{L}_X$ [69], of $\omega$. Then according to Cartan’s formula [69] 
we find 

$\mathcal{L}_X \omega = d(i_X \omega) + i_X d\omega$ (65) 

If we assume that $\omega$ is closed ($d\omega = 0$), then 

$\mathcal{L}_X \omega = d(i_X \omega)$ (66) 
Due to the non-degeneracy condition (45), for any smooth function $f : \mathfrak{M} \to \mathbb{R}$ (“Hamiltonian”) 
there is a unique vector field $X_f$ on $\mathfrak{M}$ (“Hamiltonian vector field”) such that 

$i_{X_f}\omega = df$ (67) 

Substituting (67) into (66) one gets that 

$\mathcal{L}_{X_f}\omega = 0$ (68) 

Therefore $\omega$ remains invariant under the flow generated by $X_f$. This is very desirable and natural 
from a physical viewpoint: in the case that $f = \mathcal{H}$, we would like the symplectic (canonical) 
form to remain invariant under the (“time”) evolution of the system. Turning the argument 
around, we see that the invariance under evolution of the symplectic form is equivalent to 
requiring it to be closed, something which is usually assumed from the outset without further 
explanation. 

A second point arising from the above short argument is to see that the local uniqueness 
of the symplectic structure, expressed through Darboux’s theorem, can indeed be seen from 
an equivariant viewpoint as previously suggested. The set of “symmetries” of the symplectic 
structures is infinite dimensional since it is generated by the “Hamiltonian vector fields” $X_f$ 
corresponding to any smooth enough functions $f$. This is a typical situation in topological field 
theories, for instance, and it is quite extensively employed in field and string theories on models 
with enough supersymmetries etc. It may be worth comparing this to the Riemannian case in 
which the isometry group of any metric is finite dimensional, something that allows for local 
structure that is able to differentiate between different Riemannian spaces. 

Consider a system which is described by an autonomous Hamiltonian $\mathcal{H}$ and let $S_{\mathcal{H}_0} = S_0$ 
indicate the level set $\mathcal{H} = \mathcal{H}_0$ where $\mathcal{H}_0$ is a regular value of $\mathcal{H}$. As a result, its inverse image in 
$\mathfrak{M}$ is of codimension-1 (hypersurface). Let $X_\mathcal{H}$ be the Hamiltonian vector field corresponding 
to $\mathcal{H}$. Following (46) 

$d\mathcal{H}(X) = \omega(X_\mathcal{H}, X)$ (69) 

which gives 

$d\mathcal{H}(X_\mathcal{H}) = \omega(X_\mathcal{H}, X_\mathcal{H}) = 0$ (70) 

Hence at each point of $\mathfrak{M}$, $X_\mathcal{H}$ lies in the kernel of the map $d\mathcal{H} : T\mathfrak{M} \to \mathbb{R}$. In other words, the 
Hamiltonian vector field $X_\mathcal{H}$ is tangent to, and therefore it preserves, the level sets $S_0$. This 
result is not totally unexpected if one looks at it from the viewpoint of the local compatibility 
between the symplectic and the almost complex structure expressed in the case of linear spaces 
in (58), (59). The role of $J$ (56) is to generalise the complex unit $i$, hence its action can be seen 
to amount to a rotation by a right angle. The usual gradient vector field is perpendicular 
to the level set $S_0$. Hence to compute the symplectic gradient, following (58), (59), one has 
to rotate this by an appropriate right angle, thus making it tangential to the level set 
$S_0$. The above statements of this paragraph are a formal way of expressing the fact that the 
trajectory in phase space $\mathfrak{M}$ of an isolated system evolves in the constant energy hyper-surface 
$S_0$, a realisation which lies at the foundation of the micro-canonical approach in Statistical 
Mechanics. Since the Lie derivative $\mathcal{L}_X$ is a derivation, and given (68), we get 

$\mathcal{L}_{X_\mathcal{H}} \frac{\omega^n}{n!} = 0$ (71) 

which is the familiar Liouville’s theorem on the invariance of the phase space volume $\omega^n/n!$ 
under Hamiltonian flows. One can state more in this context: assume that the object of interest 
is not a particular Hamiltonian function but instead the level set $S_0$ and the symplectic 
form $\omega$. Then if there are two Hamiltonian functions $\mathcal{H}_1$ and $\mathcal{H}_2$ for which $S_0$ is a level set 
for generally distinct values of $\mathcal{H}_1$ and $\mathcal{H}_2$, then these two Hamiltonian functions will have the 
same trajectories on $S_0$. 
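The tangency statement (70) and the area preservation expressed by (71) can be illustrated for one degree of freedom. The sketch below rests on our own assumptions: the pendulum Hamiltonian $\mathcal{H} = p^2/2 - \cos q$ is an arbitrary example, the leapfrog scheme is used as a discrete stand-in for the exact flow because each of its steps is a symplectic map, and all names are hypothetical.

```python
import math

def H(q, p):
    """Pendulum Hamiltonian, chosen only as an example."""
    return 0.5 * p * p - math.cos(q)

def grad_H(q, p):
    return (math.sin(q), p)              # (dH/dq, dH/dp)

def X_H(q, p):
    dH_dq, dH_dp = grad_H(q, p)
    return (dH_dp, -dH_dq)               # Hamilton's equations (39)

def leapfrog(q, p, dt):
    """One leapfrog step: a composition of shears, hence area preserving."""
    p_half = p - 0.5 * dt * math.sin(q)
    q_new = q + dt * p_half
    p_new = p_half - 0.5 * dt * math.sin(q_new)
    return q_new, p_new

def jacobian_det(q, p, dt, eps=1e-6):
    """Central-difference estimate of the Jacobian determinant of one step."""
    q_plus, p_plus = leapfrog(q + eps, p, dt)
    q_minus, p_minus = leapfrog(q - eps, p, dt)
    dq_dq = (q_plus - q_minus) / (2 * eps)
    dp_dq = (p_plus - p_minus) / (2 * eps)
    q_plus, p_plus = leapfrog(q, p + eps, dt)
    q_minus, p_minus = leapfrog(q, p - eps, dt)
    dq_dp = (q_plus - q_minus) / (2 * eps)
    dp_dp = (p_plus - p_minus) / (2 * eps)
    return dq_dq * dp_dp - dq_dp * dp_dq

q, p, dt = 1.0, 0.3, 0.01
# dH(X_H) = grad H . X_H vanishes identically, cf. (70).
dH_along_flow = sum(g * v for g, v in zip(grad_H(q, p), X_H(q, p)))
det_step = jacobian_det(q, p, dt)

e0 = H(q, p)
max_drift = 0.0
for _ in range(2000):
    q, p = leapfrog(q, p, dt)
    max_drift = max(max_drift, abs(H(q, p) - e0))
```

The discrete orbit does not stay exactly on the level set, but the energy error remains small and bounded over the run, which is the typical behaviour of symplectic integrators.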



4.3 The symplectic non-squeezing theorem. 

We saw that symplectic geometry, in sharp contrast to the Euclidean/Riemannian case, is 
fundamentally 2-dimensional, something which can also be justified in the following way. Let 
$D_1$ be a disk in $\mathbb{R}^2$ and $D_2$ a subset of $\mathbb{R}^2$ diffeomorphic to $D_1$. If $D_1$ and $D_2$ have the 
same area then there is a symplectic diffeomorphism $\varphi : D_1 \to D_2$. This is due to J. Moser. 
Hence in 2 dimensions, “volume preserving” and “symplectic” are adjectives that can be used 
interchangeably. Therefore in 2 dimensions what distinguishes symplectic manifolds from each 
other is their total volume. The question that arose is how much of all these statements 
can be carried over to higher dimensions. A step toward answering this question is the Gromov 
(-Eliashberg) alternative [82]: the group of symplectic diffeomorphisms of a 2n-dimensional 
connected symplectic manifold $(\mathfrak{M}, \omega)$ is either $C^0$-closed in the group of all diffeomorphisms of $\mathfrak{M}$ or 
its $C^0$-closure is the group of volume preserving diffeomorphisms of $(\mathfrak{M}, \omega)$. The fact that the 
former of these two alternatives is what actually occurs was proved in the fundamental [85]. 
This result is also known as the “symplectic non-squeezing theorem” or even as “the principle 
of the symplectic camel”: consider the standard symplectic space $(\mathbb{R}^{2n}, \omega)$ parametrized by 
$(x_1, \ldots, x_n, p_1, \ldots, p_n)$ where each $p_i$ is canonically conjugate to the corresponding $x_i$. Let 
$Z_i(R)$ indicate the cylinder of radius $R$ based on the symplectic 2-plane $(x_i, p_i)$, namely 

$Z_i(R) = \{ (x_1, \ldots, x_n, p_1, \ldots, p_n) \in \mathbb{R}^{2n} : x_i^2 + p_i^2 \leq R^2 \}$ (72) 

Then, if there is a symplectic diffeomorphism $\varphi : \mathbb{R}^{2n} \to \mathbb{R}^{2n}$ embedding the ball (33) of radius $r$ into 
$Z_i(R)$, we must have $r \leq R$. It should be noticed here that the exact choice of the symplectic 
2-plane does not matter. Someone could choose any cylinder based on another symplectic 
2-plane $(x_j, p_j)$ and the result would still hold. However this conclusion does not hold if the 
cylinder is based on an isotropic 2-plane $(x_i, x_j)$ or $(p_i, p_j)$, as a local rescaling, leaving all other 
coordinates unaffected, given by 

ip{xi,Xj) = (^X~^Xi, Xxj) , A G R\{0} (73) 

is a volume preserving transformation which is moreover symplectic and can still embed the ball 
B{r) into the cylinder Zi{R) for A < ■^. In words, what the symplectic non-squeezing theorem 
states is that it is impossible to squeeze a ball inside a symplectic cylinder if the ball’s radius is 
larger than the radius of the cross-section (base) of the cylinder. This rigidity does not apply 
for isotropic cylinders. The non-squeezing theorem can be seen as providing an obstruction for 
the existence of symplectic embeddings and it clearly shows that the “symplectic” and “volume 
preserving” classes of diffeomorphisms are not the same in dimension higher than 2, unlike the 
2-dimensional case. 
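For linear maps the non-squeezing statement reduces to a computation that can be checked directly: the image of the ball of radius $r$ under a linear symplectic map is an ellipsoid whose projection ("shadow") on a symplectic 2-plane $(x_i, p_i)$ has area at least $\pi r^2$. The sketch below only covers this linear case and rests on our own assumptions: NumPy is available, coordinates are ordered as $(x_1, \ldots, x_n, p_1, \ldots, p_n)$, symplectic matrices are generated as products of elementary symplectic blocks, and the function names are hypothetical.

```python
import math
import numpy as np

def random_linear_symplectic(n, rng):
    """Product of two shears with symmetric blocks and a block-diagonal scaling."""
    I, Z = np.eye(n), np.zeros((n, n))
    B = 0.5 * rng.normal(size=(n, n))
    B = B + B.T
    C = 0.5 * rng.normal(size=(n, n))
    C = C + C.T
    Q, _ = np.linalg.qr(rng.normal(size=(n, n)))
    A = Q @ np.diag(np.exp(0.5 * rng.normal(size=n)))
    upper = np.block([[I, B], [Z, I]])
    lower = np.block([[I, Z], [C, I]])
    scale = np.block([[A, Z], [Z, np.linalg.inv(A).T]])
    return upper @ scale @ lower

def shadow_area(M, i, r=1.0):
    """Area of the projection of M(ball of radius r) on the (x_i, p_i)-plane."""
    n = M.shape[0] // 2
    A = M[[i, n + i], :]                 # rows giving the x_i and p_i components
    return math.pi * r ** 2 * math.sqrt(np.linalg.det(A @ A.T))

rng = np.random.default_rng(0)
symplectic_shadows = [
    shadow_area(random_linear_symplectic(2, rng), 0) for _ in range(100)
]
smallest_shadow = min(symplectic_shadows)        # never below pi (unit ball)

# Volume preserving but not symplectic: shrink x_1 and p_1, expand x_2 and p_2.
eps = 0.5
squeeze = np.diag([eps, 1.0 / eps, eps, 1.0 / eps])
squeezed_shadow = shadow_area(squeeze, 0)        # pi * eps**2, below pi
```

The volume preserving map `squeeze` fits the unit ball inside a symplectic cylinder of radius `eps`, which no symplectic matrix in the sample can do.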

The non-squeezing theorem can be seen as a counterpart to Liouville’s theorem (71) which 
states that the symplectic volume of phase space remains invariant under a Hamiltonian (more 
generally: divergence-free) vector field. Liouville’s theorem allows for the arbitrary change of 
shape of any subset of phase space $\mathfrak{M}$ under symplectic/canonical transformations generated 
by a Hamiltonian vector field. This arbitrary change of shape, and in particular the fact that 
its image under canonical transformations can become arbitrarily “thin” in phase space, very 
much like “oil in water”, is at the heart of Boltzmann’s explanation of the macroscopic 
time-irreversibility of physical processes in the face of their time-reversible microscopic dynamics 
[n El la [84]. The symplectic non-squeezing theorem states that such an arbitrary change of 
shape of a given phase space volume is simply not possible under canonical transformations 
and explicitly provides a limitation. Hence in higher dimensions “symplectic” and “volume¬ 
preserving” are quite diherent terms. It is currently unknown what the ehect(s), if any, would be 
due to the non-squeezing theorem in the description of a Hamiltonian system of many degrees 
of freedom is. Would this provide some constraints in applying Boltzmann’s irreversibility 
argument with macroscopic consequences, or the presence of the many degrees of freedom would 
“wash out” such “small-scale” features, as the Central Limit Theorem does for independently 
distributed random variables? In many ways, Boltzmann’s irreversibility argument lies on the 
solid ground of Katok’s lemma [85l |65| which goes as follows: Let he two bounded 

domains of equal volume in both of which are diffeomorphic to the ball B[r). Indicate 
by AAB the symmetric difference between the sets A and B. Then for every e > 0 there is a 
Hamitonian Ti and an evolution parameter (“time”) t, so that 

no/ ((pt(ili) nil2) <e (74) 
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where (pt indicates the phase space flow generated by l-i calculated at time t. Due to this 
lemma, indeed any subset of DJI can “turn and twist” and become thin, overall, under canoni¬ 
cal transformations. What cannot happen however, according to the symplectic non-squeezing 
theorem, is the projections of such a shape, as it evolves, along the symplectic 2-planes to 
become thinner than the original ones. As stated above, it is still unknown what are the effects 
of such a limitation on the projections of the initial shape along the canonical 2-planes. From 
the viewpoint of the present work, one can claim that the concept of entropy, for Hamiltonian 
systems of many degrees of freedom, can be seen as a manifestation of this underlying symplec¬ 
tic rigidity described by the symplectic non-squeezing theorem. 

It may be worth noticing that during the 30 years that have elapsed since the
formulation and first proof of the non-squeezing theorem, several proofs different from the original
one have also appeared in the literature, none of which is elementary or even relatively
simple. In the face of this, and in order to get a better feeling for why this theorem is true,
one may wish to be less ambitious and try to present a relatively simple proof of the non-squeezing
theorem in the case of linear symplectic diffeomorphisms as indicated, for instance,
in [86]. The intuitive advantage of such an approach is that such a proof involves concepts
more familiar to physicists. We find it strange that 30 years have passed since the proof of the
non-squeezing theorem, but its significance has not been widely appreciated in Physics. The
most notable exception, in our opinion, is the work of M. de Gosson and his collaborators, who
have been consistently emphasizing the interpretation and implications of the non-squeezing
theorem, mainly for the physical cases of systems lying on the interface between Classical and
Quantum Physics [86], [87], [88], [89], [90], [91], [92], [93], [94], [95]. In this work, we rely considerably on
the concept of “quantum blob” [HH] EE] EH] EH] which will be presented shortly.

4.4 About symplectic capacities. 

As befits a fundamental work, the contribution of the non-squeezing theorem was profound
in actually establishing symplectic geometry/topology as a distinct field of mathematics rather
than an interesting, but mostly, afterthought. The work of Gromov [83] is also credited for
developing concepts, such as the J-holomorphic curves, that have proved to be enormously
influential in a variety of contexts, some of which have significant overlap with developments in
string/brane theories [96]. For our purposes, the significance of the non-squeezing theorem lies
in that it provides an explicit construction for a class of (global) symplectic invariants called
“symplectic capacities” [97] EH [MllTSl EH whose definition is the following: consider the
class of symplectic manifolds $(\mathcal{M}, \omega)$, possibly with a boundary, of dimension $2n$. A symplectic
capacity is a map $c : (\mathcal{M}, \omega) \to \mathbb{R}_+ \cup \{+\infty\}$ with $c$ having the following properties

• Monotonicity: If there is a symplectic embedding $\psi : (\mathcal{M}, \omega) \hookrightarrow (\mathcal{N}, \omega')$ then

$$c(\mathcal{M}, \omega) \leq c(\mathcal{N}, \omega') \tag{75}$$

• Conformality:

$$c(\mathcal{M}, \lambda\omega) = |\lambda| \, c(\mathcal{M}, \omega), \qquad \lambda \in \mathbb{R} \setminus \{0\} \tag{76}$$

• Normalization:

$$c(B_r, \omega) = c(Z_r, \omega) = \pi r^2 \tag{77}$$

where $B_r$ is the ball of radius $r$ and $Z_r$ is the cylinder of base radius $r$ lying on a symplectic
2-plane, both in $\mathbb{R}^{2n}$ endowed with its standard symplectic structure (64).

It should be noticed that the conformality condition (76) can also be expressed, for $\Omega \subset \mathcal{M}$ and
a fixed symplectic structure, as

• Conformality:

$$c(\lambda\Omega) = \lambda^2 c(\Omega) \tag{78}$$

The normalisation condition (77) can also be relaxed by just requiring

• Weak normalization:

$$c(B_1) > 0 \quad \text{and} \quad c(Z_1) < +\infty \tag{79}$$

It is a non-trivial fact that the symplectic capacities are invariant under symplectic diffeomorphisms.
The converse is partly true: a differentiable map $\psi$, not necessarily invertible, of $(\mathbb{R}^{2n}, \omega)$
that leaves the capacities invariant is either symplectic or anti-symplectic, namely it satisfies

$$\psi^*(\omega) = \pm\omega \tag{80}$$

The existence of symplectic capacities is not obvious at all, but it is guaranteed by the validity
of the symplectic non-squeezing theorem. Conversely, the existence of symplectic capacities
implies the non-squeezing theorem. Obviously

$$c(\mathbb{R}^{2n}, \omega) = \infty \tag{81}$$

but for a symplectic cylinder, which is also an unbounded set in $\mathbb{R}^{2n}$, its symplectic capacity
is bounded and given by (77). By contrast, for a cylinder $Z_r$ based on an isotropic 2-plane of
$\mathbb{R}^{2n}$ the symplectic structure vanishes (by definition), therefore

$$c(Z_r, \omega) = +\infty \tag{82}$$
Based on the results of the non-squeezing theorem one can define the lower Gromov width (or
just “Gromov width”, or symplectic radius) of a subset $\mathcal{U} \subset \mathbb{R}^{2n}$ by

$$c_{\min}(\mathcal{U}) = \sup \{\pi r^2 : \psi(B_r) \subset \mathcal{U}\} \tag{83}$$

which means that the lower Gromov width is determined by the maximum radius for which a ball of such a
radius can be symplectically embedded via $\psi$ into $\mathcal{U}$. Hence, if $c_{\min}(\mathcal{U}) = \pi r_0^2$ then $B_r$
cannot be symplectically embedded in $\mathcal{U}$ if $r > r_0$. By analogy, the upper Gromov width (or
cylindrical capacity) of $\mathcal{V} \subset \mathbb{R}^{2n}$ is defined as

$$c_{\max}(\mathcal{V}) = \inf \{\pi R^2 : \psi(\mathcal{V}) \subset Z_R\} \tag{84}$$

where $Z_R$ is a symplectic cylinder of base radius $R$ lying on any symplectic 2-plane of $\mathbb{R}^{2n}$.
This also means that the smallest base radius of a symplectic cylinder inside which $\mathcal{V}$ can be
symplectically embedded via the symplectic map $\psi$ is $R$, and it is impossible to find a symplectic
embedding of $\mathcal{V}$ into a symplectic cylinder with a base radius smaller than $R$. Given these
two definitions, and based on the non-squeezing theorem, one sees that the lower Gromov
width is the smallest possible capacity and the upper Gromov width is the largest possible.
Moreover, one can check that their convex combination

$$c_t = t \, c_{\max} + (1 - t) \, c_{\min}, \qquad t \in [0, 1] \tag{85}$$

is also a symplectic capacity. Hence there is an infinity of symplectic capacities on $\mathbb{R}^{2n}$. Despite
this fact, constructing explicitly such capacities has proved to be a substantial challenge: to
this date several such capacities have been constructed, such as [83], [93], [98], [99], [100], [80], [101],
none of which is obvious to either construct or prove that it indeed obeys the axioms (75)-(77)
or (75), (76) and (79). Without going into any details, we can indicatively mention that the
Hofer-Zehnder capacity for compact, convex $\mathcal{U} \subset \mathbb{R}^{2n}$ takes the form of an integral along the
shortest periodic orbit $\gamma_0$ on the boundary of $\mathcal{U}$ of the first Poincaré invariant $p \, dq$ familiar from
Hamiltonian mechanics, namely

$$c_{HZ}(\mathcal{U}) = \oint_{\gamma_0} p_i \, dq_i \tag{86}$$

where the summation convention for $i = 1, \ldots, n$ has been assumed.
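
As a small numerical illustration of an action integral of the type (86), the sketch below evaluates $\oint p \, dq$ along circular orbits. It rests on assumptions that are not part of the text: the domain is taken to be the ellipsoid $\sum_k (x_k^2 + p_k^2)/r_k^2 \leq 1$, only the periodic orbits lying in the conjugate planes $(x_k, p_k)$ are considered, and each is parametrized as $q(t) = r_k \cos t$, $p(t) = -r_k \sin t$. The function names are hypothetical.

```python
import math

def action_on_plane_orbit(radius, steps=20000):
    """Approximate the integral of p dq over one period of the orbit
    q(t) = radius*cos(t), p(t) = -radius*sin(t), using the midpoint rule."""
    total = 0.0
    dt = 2.0 * math.pi / steps
    for k in range(steps):
        t = (k + 0.5) * dt
        p = -radius * math.sin(t)
        dq_dt = -radius * math.sin(t)  # derivative of q(t) = radius*cos(t)
        total += p * dq_dt * dt
    return total

def shortest_planar_action(radii):
    """Smallest action among the planar orbits, one per conjugate plane."""
    return min(action_on_plane_orbit(r) for r in radii)

radii = [3.0, 1.0, 2.0]
print("actions:", [action_on_plane_orbit(r) for r in radii])
print("smallest:", shortest_planar_action(radii), "vs pi*min(r)^2 =", math.pi * min(radii) ** 2)
```

Each planar orbit has action $\pi r_k^2$, the area of the disc it encloses, so the smallest of these actions is the area of the smallest conjugate-plane section of the ellipsoid.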


4.5 Symplectic capacities of ellipsoids and the uncertainty principle. 

Calculating explicitly the symplectic capacities of manifolds, or even subsets of $\mathbb{R}^{2n}$, has proved
to be a difficult task. Among the very few cases for which an answer is known, the ellipsoids in
$\mathbb{R}^{2n}$ stand out, because all symplectic capacities have the same value on them. This is straightforward
to see based on the symplectic non-squeezing theorem. Suppose that we
have a real ellipsoid, the smallest two axes of which are of equal length $l$. Then a sphere of
radius $l$ can be barely embedded in the ellipsoid, and the ellipsoid itself barely fits in a
symplectic cylinder of base radius $l$. Hence the upper and lower Gromov widths of this
ellipsoid are both determined by $l$, so all its symplectic capacities are proportional to $l^2$.


This argument can be made more concrete as follows [M]. Let $A$ be a $2n \times 2n$ real positive-definite
matrix, and $J$ be as in (56). The eigenvalues of the matrix $JA$ have the form $\pm i\lambda_k$
with $\lambda_k > 0$ and are called the symplectic eigenvalues of $A$. The set $\{\lambda_k\}$, $k = 1, \ldots, n$ is called
the symplectic spectrum of $A$. Williamson’s theorem states that there is an element $B$
of the symplectic group $Sp(2n, \mathbb{R})$ such that

$$B^T A B = \begin{pmatrix} \Lambda & 0 \\ 0 & \Lambda \end{pmatrix} \tag{87}$$

where the superscript $T$ stands for transposition and $\Lambda = \mathrm{diag}(\lambda_1, \ldots, \lambda_n)$. Parametrize $\mathbb{R}^{2n}$
by the vector $z = (z_i)$, $i = 1, \ldots, 2n$ and consider the ellipsoid

$$\mathcal{E} : \; z^T A z \leq 1 \tag{88}$$

Then for any symplectic capacity $c$ one finds that

$$c(\mathcal{E}) = \frac{\pi}{\lambda_{\max}} \tag{89}$$

where

$$\lambda_{\max} = \max_k \{\lambda_k, \; k = 1, \ldots, n\} \tag{90}$$

Hence, from a Hamiltonian/symplectic viewpoint, the most appropriate/natural cells into which
one should divide the phase space during coarse-graining are the ellipsoids (88).
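
For a concrete feeling of the symplectic spectrum, the following minimal sketch treats only the special case of a diagonal matrix $A = \mathrm{diag}(a_1, \ldots, a_n, b_1, \ldots, b_n)$ in the coordinate ordering $z = (x_1, \ldots, x_n, p_1, \ldots, p_n)$; this restriction, and the function names, are assumptions made for illustration. In that case $JA$ decouples into $2 \times 2$ blocks with eigenvalues $\pm i \sqrt{a_k b_k}$, so the symplectic eigenvalues are $\lambda_k = \sqrt{a_k b_k}$ regardless of the sign convention chosen for $J$, and the capacity of the ellipsoid follows from $\pi / \lambda_{\max}$.

```python
import math

def symplectic_spectrum_diagonal(a, b):
    """Symplectic eigenvalues of A = diag(a_1..a_n, b_1..b_n), all entries > 0.

    Valid only for diagonal A with the ordering z = (x_1..x_n, p_1..p_n)."""
    if len(a) != len(b):
        raise ValueError("a and b must have the same length")
    if any(v <= 0.0 for v in list(a) + list(b)):
        raise ValueError("A must be positive-definite")
    return [math.sqrt(ak * bk) for ak, bk in zip(a, b)]

def ellipsoid_capacity(a, b):
    """Capacity pi / lambda_max of the ellipsoid sum_k (a_k x_k^2 + b_k p_k^2) <= 1."""
    return math.pi / max(symplectic_spectrum_diagonal(a, b))

# Ball of radius 2 in R^4: A = (1/4) * identity.
print(ellipsoid_capacity([0.25, 0.25], [0.25, 0.25]))
# An anisotropic ellipsoid in R^4.
print(symplectic_spectrum_diagonal([1.0, 4.0], [1.0, 1.0]),
      ellipsoid_capacity([1.0, 4.0], [1.0, 1.0]))
```

For the ball of radius $r$ this reproduces the normalization value $\pi r^2$; for the anisotropic example the capacity equals the area of the smallest section of the ellipsoid by a conjugate plane $(x_k, p_k)$.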


It is a non-trivial fact [HS], about which we would not like to elaborate, that the Heisenberg
uncertainty principle, or its generalisation, the Robertson-Schrödinger inequality

$$(\Delta x_i)^2 (\Delta p_i)^2 \geq (\Delta(x_i, p_i))^2 + \frac{\hbar^2}{4} \tag{91}$$

where $\Delta(x_i, p_i)$ stands for the covariance matrix element, can be re-cast in terms of the symplectic
capacities as

$$c(\mathcal{W}) \geq \frac{1}{2} h \tag{92}$$

where $\mathcal{W}$ is the Wigner ellipsoid associated to the covariances and $c$ a symplectic capacity. This
falls within the framework of the Wigner-Weyl approach to quantisation which is extensively
used in the more mathematically rigorous or the quantum/classical interface treatments. In
more familiar terms, one can see the content of the uncertainty principle as being expressed by
the dictum that a function and its Fourier transform cannot be simultaneously sharply localized
[TO [TUB] . Since we know that coherent states (Gaussians) represent minimum uncertainty
states, and that they are mapped into Gaussians under the Fourier transform, let’s assume
that a wave-function $\psi : \mathbb{R}^n \to \mathbb{C}$ and its Fourier transform $\mathcal{F}[\psi]$ are bounded, in modulus,
by such Gaussians, i.e.

$$|\psi(x)| \leq C \exp\left(-\frac{1}{2\hbar} A x^2\right), \qquad |\mathcal{F}[\psi](p)| \leq C \exp\left(-\frac{1}{2\hbar} B p^2\right) \tag{93}$$

where $C$ is a constant and $A$, $B$ are real symmetric matrices. Then consider the phase-space
ellipsoid $\mathcal{E} : A x^2 + B p^2 \leq \hbar$. The Robertson-Schrödinger uncertainty principle can be
expressed via a symplectic capacity $c$ by

$$c(\mathcal{E}) \geq \frac{1}{2} h \tag{94}$$

These conditions imply that from a symplectic viewpoint, the appropriate choice of cell that
should be used for phase-space coarse-graining is an ellipsoid rather than a cube. This is also
compatible with, if not necessarily dictated by, Quantum Mechanics and provides a justification,
in part, for our approach and interpretations presented in the sequel.


4.6 Symplectic vs Riemannian features. 

It should be noticed that the symplectic capacities for a $2n$-dimensional manifold $\mathcal{M}$ are genuinely
new invariants that cannot be deduced from its Riemannian volume $\mathrm{vol} \, \mathcal{M}$ by setting,
for instance

$$c(\mathcal{M}) = (\mathrm{vol} \, \mathcal{M})^{\frac{1}{n}} \tag{95}$$

as this would violate (77) for a symplectic cylinder, for instance. On the other hand, we see
that if the underlying manifold $\mathcal{M}$ is compact, then it will have a finite volume hence

$$c_{\min}(\mathcal{M}) < +\infty \tag{96}$$

which is true for all compact manifolds. In the special case of 2 (real) dimensions, it is known
that all symplectic capacities coincide with the area, as long as the manifold $\mathcal{M}$ is connected
and simply connected [104], [105]. Hence in two dimensions the symplectic and volume-preserving
geometries coincide. But two dimensions are special in symplectic geometry. There is a suspicion /
conjecture that this statement may be partly true (“all symplectic capacities coincide”)
for convex bodies in $\mathbb{R}^{2n}$ but there seems to be neither a proof nor a counterexample needed
to resolve it. A fundamental, generally still unresolved, question is whether there exist intermediate
symplectic invariants [80]. The simple-sounding question about finding the necessary
and sufficient conditions for an ellipsoid to be symplectically embeddable in another is still
generally unresolved; an answer became known only recently in 4 dimensions [HT].

At this stage, there is an obvious question about the possible relation between Riemannian
and symplectic invariants of a manifold $\mathcal{M}$. As noticed previously, no such obvious relation
exists for a power of the volume (95). But the capacities $c$ are a symplectic way of measuring
the size of a manifold. On top of that, in the phase space of a Hamiltonian system $\mathcal{M}$ there
is a Riemannian metric $g$, which is usually deduced from the quadratic term (“kinetic term”)
of the Hamiltonian. Therefore, one may wish to compare the areas of (real) 2-dimensional
sub-manifolds $\mathcal{V} \subset \mathcal{M}$ expressed via their embeddings $\varphi : \mathcal{V} \to \mathcal{M}$. Such an area is computed
symplectically via the pullback of the symplectic form

$$A_S = \int_{\mathcal{V}} \varphi^*(\omega) \tag{97}$$

or in a Riemannian manner by the area formula [106]

$$A_R = \int_{\mathcal{V}} \varphi^*(g) \, d\mathrm{vol}_{\mathcal{V}} \tag{98}$$

where we use the volume on $\mathcal{V}$ induced by the pullback metric given by the embedding $\varphi$. It
turns out that (97), (98) are equal for pseudo-holomorphic curves $\mathcal{V}$ [83], [107] which come about
due to the extension of (58), (59) to almost complex target manifolds $\mathcal{M}$, if endowed with a
tame symplectic structure $\omega$. Such pseudo-holomorphic curves turn out to be minimal surfaces
in the Riemannian sense, hence can be considered as the analogue of geodesics in symplectic geometry.
As strings move in space-time by “sweeping out” surfaces that should be of stationary
(“minimal”) area with respect to variations of the area functional, according to the principle of
“least”/stationary action, the pseudo-holomorphic curves have been of great interest to string
theory for the last two decades.

Not much is generally known about the relation between symplectic and Riemannian invariants
of manifolds. A prominent role in such a relation is played by the conjecture of C.
Viterbo [108] which is formulated for convex domains in $\mathbb{R}^{2n}$. Let $\Omega$ be such a convex domain
and $B_1$ be the Euclidean unit radius ball in $\mathbb{R}^{2n}$ (33). For any symplectic capacity $c$ and any
such convex $\Omega$ one has

$$\frac{c(\Omega)}{c(B_1)} \leq \left( \frac{\mathrm{vol}(\Omega)}{\mathrm{vol}(B_1)} \right)^{\frac{1}{n}} \tag{99}$$

where vol denotes the Riemannian volume of the corresponding sets. One immediately sees that
this conjecture follows the same philosophy as the failed attempt to relate the volume and the
symplectic capacity (95). The big difference between (95) and (99) is that (99) makes a similar,
in spirit, statement but expressed in relative terms, i.e. relative to the corresponding quantities
for a ball. The meaning of this symplectic isoperimetric conjecture is that among all convex
domains $\Omega$ in $\mathbb{R}^{2n}$ with a given volume, the Euclidean ball has the maximal symplectic capacity.
Such a statement has a clearly isoperimetric flavour (“why is a droplet/bubble spherical”? or,
among all shapes having a fixed volume, determine the one(s) with the least boundary/surface
area). Viterbo’s conjecture (99) is not fully proved yet, although weaker versions of it have
been proved, which rely on inserting a constant $A(n)$ on the right hand side of (99)

$$\frac{c(\Omega)}{c(B_1)} \leq A(n) \left( \frac{\mathrm{vol}(\Omega)}{\mathrm{vol}(B_1)} \right)^{\frac{1}{n}} \tag{100}$$

Initially C. Viterbo proved (100) [108] for $A(n) \sim n$, in particular $A(n) \sim 2n$ for convex domains
symmetric with respect to the origin, and $A(n) \sim 32n$ for general convex domains. After
that [109] improved the estimate to $A(n) \sim (\log n)^2$. The best known estimate, to our
knowledge, was furnished by [110] with $A(n)$ becoming an actual constant, i.e. independent of
the dimension $n$ altogether, namely $A(n) = A_0$. It should be noticed that the assumption of
$\Omega$ being convex is essential: indeed, star-shaped domains were constructed in [111] having an
arbitrarily small volume but fixed capacity, which violates (99). The conjecture itself is true
for ellipsoids and convex Reinhardt domains [111]. In the exact opposite direction of Viterbo’s
conjecture, namely in finding the worst possible symplectic capacity to volume ratios, one can
see that the symplectic cylinders are candidates, since this ratio is zero in their case.
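
The statement that the conjecture holds for ellipsoids can be made tangible in a simple special case. The sketch below assumes the ellipsoid $\sum_k (x_k^2 + p_k^2)/r_k^2 \leq 1$ in $\mathbb{R}^{2n}$, whose capacity is $\pi \min_k r_k^2$ and whose volume is $\frac{\pi^n}{n!} \prod_k r_k^2$; the unit ball corresponds to all $r_k = 1$. Under these assumptions the capacity ratio is $\min_k r_k^2$ and the volume side is the geometric mean of the $r_k^2$, so the inequality reduces to “minimum $\leq$ geometric mean”. The function name is hypothetical.

```python
import math

def viterbo_sides(radii):
    """Return (capacity ratio, volume ratio to the power 1/n) for the ellipsoid
    sum_k (x_k^2 + p_k^2) / r_k^2 <= 1 compared with the unit ball in R^(2n)."""
    n = len(radii)
    if n == 0 or any(r <= 0.0 for r in radii):
        raise ValueError("radii must be a non-empty list of positive numbers")
    capacity_ratio = min(r * r for r in radii)
    # The factors pi^n / n! cancel in the volume ratio; work with logarithms.
    log_volume_ratio = sum(2.0 * math.log(r) for r in radii)
    volume_side = math.exp(log_volume_ratio / n)
    return capacity_ratio, volume_side

for radii in ([1.5, 1.5, 1.5], [1.0, 2.0, 4.0], [0.5, 3.0]):
    lhs, rhs = viterbo_sides(radii)
    print(radii, "capacity ratio =", lhs, "volume side =", rhs)
```

Equality occurs only when all $r_k$ coincide, i.e. for balls, in line with the isoperimetric reading of the conjecture given above.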


We see from the preceding analysis that ellipsoids are in some sense minimal and still are 
invariant under symplectic diffeomorphisms. Hence, from a purely Hamiltonian/symplectic 
viewpoint it makes more sense to use as cells in coarse-graining the phase space $\mathcal{M}$ ellipsoids
rather than cubes. This is in sharp contrast to the Euclidean/Riemannian viewpoint of the 
previous Section for which, as we saw, cubical cells appear to be the most appropriate for such 
a coarse-graining process. Quantifying aspects of this mismatch between ellipsoids/spheres and 
cubes (or polytopes/convex bodies) to which we will ascribe the origin of entropy, is the topic 
of the next Section. 


5 Basic concepts and implications of convexity. 

In this Section we will be using convexity exclusively in $\mathbb{R}^n$. This is not as big a compromise
as it may appear since all manifolds, symplectic or not, are locally isometric to $\mathbb{R}^n$ up to first
order deviations. Curvature appears as a second order deviation from the Euclidean metric of
$\mathbb{R}^n$. Hence, if one focuses on local properties of a manifold, understanding convexity of subsets
of $\mathbb{R}^n$ already accomplishes quite a bit. Moreover, as was mentioned above, Nash’s embedding
theorems show that a generic manifold is not “too different” from a Euclidean space since it
can be isometrically embedded in a Euclidean space of high enough dimension. Working in $\mathbb{R}^n$
allows us to use its linear structure to arrive at results that would not otherwise be accessible.
Since we are working with Hamiltonian systems of many degrees of freedom, we should be able
to eventually consider the thermodynamic limit. Even though taking such a limit can be a
non-trivial, model dependent, process, it may not be unreasonable to assume, naively, that it
is related to the limit $n \to \infty$. Hence we are interested in the behaviour of convex subsets of
$\mathbb{R}^n$, for $n$ very large. This is the realm of the “local theory of Banach spaces”, or “asymptotic
geometric analysis”, or “asymptotic convex geometry”. It turns out that there are highly
non-trivial and unexpected / counter-intuitive results in this realm (such as the “concentration
of measure”) which lies somewhere between linear algebra (for $n$ fixed, finite) and functional
analysis (where one deals with infinite dimensional spaces). We would like to know about geometric
structures that are, in a sense, “typical”, so they can encode results of importance for
Statistical Mechanics. The field of asymptotic convex geometry is highly developed. For our
very limited purposes in this work, we have drawn, in various degrees, material from the books
[112], [113], [114], [115], [116], [117], [118] and the reviews [119], [120], [121], [122] to which one can turn,
as well as to the numerous other outstanding references, for details and proofs and authoritative
and comprehensive expositions on these topics. In the sequel, we will confine ourselves to
concepts and results needed in the present work.


5.1 Convex bodies, polar and functional dualities.

A convex set $\mathcal{K} \subset \mathbb{R}^n$ is a set such that

$$\forall \, x, y \in \mathcal{K}, \qquad tx + (1 - t)y \in \mathcal{K}, \qquad t \in [0, 1] \tag{101}$$

A convex body is a compact, convex subset of $\mathbb{R}^n$ having a non-empty interior. Hence balls,
ellipsoids, cubes, polytopes etc. are convex bodies. A fundamental theorem in convexity is the
Hahn-Banach separation theorem which implies that each convex body $\mathcal{K}$ is an intersection of
half-spaces and that at each point of the boundary $\partial\mathcal{K}$ of such a convex body there is at least
one supporting hyperplane. Intuitively, this should be obvious. We will be mostly interested
in convex bodies that are symmetric with respect to the origin of $\mathbb{R}^n$. A convex body $\mathcal{K}$ is
symmetric with respect to the origin of $\mathbb{R}^n$ if $x \in \mathcal{K}$ implies $-x \in \mathcal{K}$. Symmetric convex bodies
are of great interest for the following reason: consider a finite dimensional normed space $X$.
Then it is possible to choose a bijection $X \to \mathbb{R}^n$ such that $X$ can be identified with $(\mathbb{R}^n, \| \cdot \|)$ for
some norm $\| \cdot \|$. What we have in mind is either the usual Euclidean norm, or a “generalised”
norm induced by generalized products such as (12) or (21). The unit balls, centered at the origin,
$B_1(X)$ of such norms are symmetric convex bodies, with respect to the origin. Conversely, to
any symmetric convex body $\mathcal{K}$ one can assign canonically a corresponding (Minkowski) norm $\| \cdot \|_{\mathcal{K}}$
given by

$$\|x\|_{\mathcal{K}} = \min \{s > 0 : x \in s\mathcal{K}\} \tag{102}$$

whose unit ball is $B_1(X) = \mathcal{K}$. As examples, with the notation of (33), one can see that
$B_1(l_2^n)$ is the Euclidean ball, $B_1(l_\infty^n)$ is the usual cube (35) whose edge is the interval $[-1, +1]$
of length 2 units, so it is symmetric with respect to the origin. Another example, needed for
our future reference, is the $n$-cross polytope, which is defined as the convex hull (97) of the
endpoints of the unit orthonormal coordinate vectors $\pm e_i$, $i = 1, \ldots, n$ in a Cartesian basis of $\mathbb{R}^n$.
The metric having the cross-polytope as its unit ball is called the Manhattan/taxicab metric,
or more concretely, the cross-polytope is $B_1(l_1^n)$. The significance of the cross-polytope, for
our purposes, stems from the fact that it is the polar set of the unit cube $B_1(l_\infty^n)$. Consider a
convex set $\mathcal{K} \subset \mathbb{R}^n$. Then its polar set $\mathcal{K}^\circ$ is defined by

$$\mathcal{K}^\circ = \{b \in \mathbb{R}^n : \langle a, b \rangle \leq 1, \; \forall \, a \in \mathcal{K}\} \tag{103}$$

where $\langle \cdot, \cdot \rangle$ is the Euclidean inner product on $\mathbb{R}^n$. What polarity does is to exchange the faces
with the vertices. By inspection, a cube in $\mathbb{R}^n$ has $2n$ faces and $2^n$ vertices and the converse is
true for the $n$ cross-polytope. Upon a more careful examination, one can see that the polar of
an $n$-cube is the $n$ cross-polytope.
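
The polarity between the cube and the cross-polytope can be verified numerically in low dimensions. The sketch below is an illustration with hypothetical function names: since a linear functional attains its maximum over the cube $[-1, 1]^n$ at a vertex, a point $b$ belongs to the polar of the cube exactly when $\max_a \langle a, b \rangle \leq 1$ over the $2^n$ vertices $a$, and this maximum equals the Manhattan norm $\sum_i |b_i|$.

```python
import itertools

def cube_vertices(n):
    """All 2^n vertices of the cube [-1, 1]^n."""
    return list(itertools.product((-1.0, 1.0), repeat=n))

def support_over_cube(b):
    """Maximum of the inner product <a, b> over the vertices a of [-1, 1]^n."""
    return max(sum(ai * bi for ai, bi in zip(a, b))
               for a in cube_vertices(len(b)))

def l1_norm(b):
    """Manhattan/taxicab norm, whose unit ball is the cross-polytope."""
    return sum(abs(bi) for bi in b)

def in_polar_of_cube(b, tol=1e-12):
    """Membership test for the polar set of the cube, following the definition."""
    return support_over_cube(b) <= 1.0 + tol

n = 3
print("vertices of the cube:", len(cube_vertices(n)), "faces of the cube:", 2 * n)
for b in [(0.3, -0.5, 0.1), (0.0, 1.0, 0.0), (0.6, 0.6, 0.0)]:
    print(b, support_over_cube(b), l1_norm(b), in_polar_of_cube(b))
```

The enumeration of vertices grows as $2^n$, so this check is only practical for small $n$; it is meant as a sanity check of the definition rather than as an efficient algorithm.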

There is an equivalent functional way of expressing the above polar facts, mainly because
$\mathbb{R}^n$ is a linear space. Let $X$, $Y$ be Banach spaces, whose corresponding norms, even though
different from each other, will still be indicated by $\| \cdot \|$ for simplicity of notation. Consider a
linear operator $T : X \to Y$. Such an operator is bounded if there is some constant $C > 0$ such
that

$$\|Tx\| \leq C \|x\|, \qquad \forall \, x \in X \tag{104}$$

Then the infimum of such numbers is the operator norm of $T$, indicated by $\|T\|$, so it is defined
by

$$\|T\| = \sup_{x \in X \setminus \{0\}} \frac{\|Tx\|}{\|x\|} = \sup_{\|x\| = 1} \|Tx\| \tag{105}$$

In the case that $Y = \mathbb{R}$, $T$ is called a linear functional on $X$. The space of continuous
linear functionals of $X$ endowed with the operator norm is called the dual space of $X$ and it is
indicated by $X^*$. The dual space of a Banach space turns out to be a Banach space too. As is
well-known from linear algebra, such a $T$ is an isomorphism if it is a bijection and in addition
both $\|T\|$ and $\|T^{-1}\|$ are bounded. Such an isomorphism is an isometry if

$$\|Tx\| = \|x\|, \qquad \forall \, x \in X \tag{106}$$

According to the Riesz representation theorem, every element of $X^*$ has the form $a \mapsto \langle a, b \rangle$
for some $b \in \mathbb{R}^n$. The unit ball of $X^*$ is therefore

$$B_1(X^*) = \{b \in \mathbb{R}^n : \langle a, b \rangle \leq 1, \; \forall \, a \in B_1(X)\} \tag{107}$$

which due to (103) can be rewritten as

$$B_1(X^*) = (B_1(X))^\circ \tag{108}$$

Hence the convex concept of polarity of symmetric convex bodies in $\mathbb{R}^n$ corresponds exactly
to the functional analytic concept of duality of $n$-dimensional normed spaces. Given the well-known
duality

$$(l_\infty^n)^* = l_1^n \tag{109}$$

we see that (108) is the generalisation of the polar relation

$$B_1(l_1^n) = (B_1(l_\infty^n))^\circ \tag{110}$$

indicated in the discussion preceding (104).


5.2 The Banach-Mazur distance. 

We would like to compare, quantitatively, the coarse-graining of phase space by cubes (and
their images under generalised operations induced by non-additive entropies, which are convex
polytopes) and the balls/ellipsoids that should be the cells of coarse-graining of phase space as
dictated by its symplectic structure. This can be accomplished by introducing a distance $d$ in
the set of all convex bodies. To define such a distance consider two symmetric convex bodies
$\mathcal{K}_1$ and $\mathcal{K}_2$ of $\mathbb{R}^n$. Then such a distance $d$ is given by

$$d(\mathcal{K}_1, \mathcal{K}_2) = \inf \{\alpha\beta : \; \mathcal{K}_1 \subset \alpha\mathcal{K}_2, \; \mathcal{K}_2 \subset \beta\mathcal{K}_1, \; \alpha > 0, \; \beta > 0\} \tag{111}$$

What this essentially states is that the distance between $\mathcal{K}_1, \mathcal{K}_2$ is the smallest number $\delta$ such
that if $\mathcal{K}_1$ can be barely inscribed in $\mathcal{K}_2$, then $\delta\mathcal{K}_1$ can be barely circumscribed around $\mathcal{K}_2$
and vice versa, with the role of $\mathcal{K}_1, \mathcal{K}_2$ interchanged. This definition relies on dilations of the
symmetric convex bodies $\mathcal{K}_1, \mathcal{K}_2$ and therefore it is, not surprisingly, multiplicative. To get
an additive distance, one should take the logarithm of such $d$. Moreover, and since physical
theories use metric concepts extensively, it may be worthwhile comparing (111) with the well-known
Hausdorff distance as well as the numerous other distance functions that one can
come up with, many of which are discussed in HU.

Consider the $n$-dimensional spaces $X_1, X_2$ whose unit balls are $\mathcal{K}_1, \mathcal{K}_2$ respectively, as explained
in the previous subsection. Then, following (111), their Banach-Mazur distance is defined
as

$$d(X_1, X_2) = \inf \{d(\mathcal{K}_1, T\mathcal{K}_2) : \; T \in GL(n, \mathbb{R})\} \tag{112}$$

where $T$ is an operator which is an element of the general linear (Lie) group $GL(n, \mathbb{R})$. Essentially
what $T$ does is to move around “rigidly” $\mathcal{K}_1, \mathcal{K}_2$ until their mutual distance (111) becomes
as small as possible, i.e. one of them “barely fitting” inside and outside the other, and vice versa.
This can be expressed more succinctly by stating that $d$ is the smallest number such that
$\mathcal{K}_1 \subset T(\mathcal{K}_2) \subset d \, \mathcal{K}_1$ for
some $T \in GL(n, \mathbb{R})$. Since, as seen in the previous subsection, there is an intimate link between
convex geometry in $\mathbb{R}^n$ and functional analysis, one can further translate the Banach-Mazur
distance (112) in the functional analytic language as

$$d(X_1, X_2) = \min \{\|T\| \cdot \|T^{-1}\|, \; T : X_1 \to X_2 \text{ isomorphism}\} \tag{113}$$

which is a form of the Banach-Mazur distance between normed spaces also frequently encountered.


5.3 The Banach-Mazur distance between a sphere and a cube. 


Consider now the cube $B_1(l_\infty^n) = [-1, +1]^n$ in $\mathbb{R}^n$. It is easy to see that one can inscribe in it a ball
of radius 1 and can circumscribe around it a ball of radius $\sqrt{n}$, for any $n$, and one cannot do
better than that. Hence, it is intuitively obvious that the Banach-Mazur distance between a
cube and a ball is $\sqrt{n}$. Therefore as the dimension increases the cube looks less and less like
a ball: the vertices of the cube “move” further and further away from the centre of the ball,
assumed fixed, or conversely the ball “curves more” as $n$ increases and therefore it becomes
smaller and smaller inside a fixed cube. To make this a bit more precise, one can start from
the well-known formula for the volume of the Euclidean unit-radius ball in $\mathbb{R}^n$

$$\mathrm{vol} \, B_1 = \frac{\pi^{\frac{n}{2}}}{\Gamma\left(\frac{n}{2} + 1\right)} \tag{114}$$

where $\Gamma$ stands for the (Euler) gamma function. Using Stirling’s approximation

$$\Gamma(x + 1) \sim \sqrt{2\pi x} \left( \frac{x}{e} \right)^x, \qquad x \to \infty \tag{115}$$

provides the estimate

$$\mathrm{vol} \, B_1 \sim \frac{1}{\sqrt{\pi n}} \left( \frac{2\pi e}{n} \right)^{\frac{n}{2}} \tag{116}$$

Therefore one can see that due to the curvature of the surface of the ball, the ball of unit radius
has a volume that approaches zero very fast as $n \to \infty$. This is in sharp contrast with the case
of the cube $[-1, +1]^n$ whose volume increases as $n \to \infty$. Hence it is expected that the distance between
the cube and the ball of unit radius will increase as $n$ increases. Not only that, but such
a distance is as large as possible for the ball and the cube, a direct outcome of F. John’s theorem.
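
The contrast between the ball and the cube can be tabulated directly. The following sketch uses the exact volume $\pi^{n/2} / \Gamma(\frac{n}{2} + 1)$ of the unit ball, and compares it with the estimate $\frac{1}{\sqrt{\pi n}} (2\pi e / n)^{n/2}$ that results from inserting Stirling’s approximation $\Gamma(x + 1) \approx \sqrt{2\pi x} \, (x/e)^x$ into it. The function names are hypothetical and the particular dimensions printed are arbitrary choices.

```python
import math

def unit_ball_volume(n):
    """Exact volume of the Euclidean unit ball in R^n."""
    return math.pi ** (n / 2.0) / math.gamma(n / 2.0 + 1.0)

def unit_ball_volume_stirling(n):
    """Estimate of the same volume obtained from Stirling's approximation."""
    return (2.0 * math.pi * math.e / n) ** (n / 2.0) / math.sqrt(math.pi * n)

def cube_volume(n):
    """Volume of the cube [-1, 1]^n."""
    return 2.0 ** n

def cube_circumradius(n):
    """Radius of the smallest ball containing [-1, 1]^n (distance to a vertex)."""
    return math.sqrt(n)

for n in (2, 5, 20, 100):
    print(n, unit_ball_volume(n), unit_ball_volume_stirling(n),
          cube_volume(n), cube_circumradius(n))
```

The table shows the ball volume collapsing while the cube volume $2^n$ and the ratio $\sqrt{n}$ between the circumscribed and inscribed radii keep growing, which is the behaviour described in the text.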
Based on the above and from a metric viewpoint, as was also previously remarked, coarse-graining
the phase space $\mathcal{M}$ with balls instead of cubes (or vice versa) should provide
greater discrepancies than the coarse-graining of $\mathcal{M}$ with any other regular polytopes.
But entropy, and this is probably clearest in the definition of the Kolmogorov-Sinai case [65], involves
considering the supremum over all possible phase space partitions. Since coarse-graining
involves a piece-wise (for each cell) constant probability density $\rho_{cg}$, phase space measures are
constant multiples, per cell, of their phase space volumes. As a result, substituting small enough
cubic cells for balls (and vice versa) in coarse-graining will provide the maximal possible measure
discrepancy, namely the maximal possible phase-space volume loss, which is what $S_{BGS}$
has been designed to capture.

It should be noticed that the polar duality is of fundamental importance in asymptotic convex
geometry, hence it may be desirable to compare pairs of polar dual polytopes rather than
single convex bodies. In this respect the Euclidean ball is unique in that it is the only convex body
that is self-dual under polarity. If it is pairs of polar dual polytopes that are of fundamental interest,
then one should also consider the Banach-Mazur distance between $B_1(l_2^n)$ and the cross-polytope
$B_1(l_1^n)$. It should also be intuitively obvious that such a distance is $\sqrt{n}$ and this is as bad as
things can get. Intuitively the cross-polytope and the cube are the “pointiest” of all convex bodies
hence their distance from the “roundest” of all of them, which is the ball, should be maximal.

For our purposes the ball, or more generally ellipsoids, are a manifestation of the symplectic
structure of $\mathbb{R}^n$ and the cube encodes geometrically the concept of independence, as seen in
the previous sections. Since however we would like to allow for generalised concepts of “independence”
induced by entropies such as $S_q$, etc. whose induced “cubes” are, in general,
symmetric polytopes/convex bodies, what we would like to ask is what the Banach-Mazur distance
between convex bodies and ellipsoids is. And since actually calculating the Banach-Mazur
distance between convex bodies has proved to be quite hard, in general, one can settle either for
upper bounds for that distance or for asymptotic estimates as $n \to \infty$ (the “thermodynamic
limit”). That there is a unique ellipsoid of maximal volume that can barely fit inside
a convex body is guaranteed by F. John’s theorem, where conditions for this maximal ellipsoid
to actually be the Euclidean sphere are also spelled out.

Another result, intuitively plausible, if not obvious, is that one reason why a sphere and a
cube are so different is that the cube has too few faces. What would happen if one allowed for
a symmetric convex body with far more faces? Then the answer is that its distance from the
sphere should decrease. And this is actually what happens, but the crucial matter, for our
purposes, is that the increase in the number of faces has to be exponential in terms of the
dimension. More precisely, consider a symmetric convex body $\mathcal{K} \subset \mathbb{R}^n$ such
that $d(\mathcal{K}, B_1) = d$. Then $\mathcal{K}$ must have at least $\exp\left(\frac{n}{2d^2}\right)$
faces. Since the distance between the cube and the ball is $\sqrt{n}$, a cube in $\mathbb{R}^m$
has almost spherical sections of dimension $\log m$. A substantial generalisation of this fact,
Dvoretzky’s theorem, will be used in Section 6.
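The face-count bound just quoted can be turned into a small numerical sketch. The function names are ours and purely illustrative. The only inputs are the bound $\exp(n/(2d^2))$ and the elementary fact that a central section of the cube in $\mathbb{R}^m$ is a polytope with at most $2m$ faces, so that a $k$-dimensional section within distance $d$ of a ball must satisfy $\exp(k/(2d^2)) \leq 2m$. This gives only a necessary condition on $k$; it does not show that such sections exist.

```python
import math


def min_faces(k: int, d: float) -> float:
    """Lower bound exp(k / (2 d^2)) on the number of faces of a symmetric
    polytope in R^k at Banach-Mazur distance d from the Euclidean ball."""
    return math.exp(k / (2.0 * d * d))


def max_round_section_dim(m: int, d: float) -> int:
    """Largest k not excluded by the face count: a central section of the
    cube in R^m has at most 2m faces, so exp(k / (2 d^2)) <= 2m."""
    return math.floor(2.0 * d * d * math.log(2 * m))


if __name__ == "__main__":
    # The admissible dimension grows only logarithmically with m.
    for m in (10**2, 10**4, 10**6):
        print(m, max_round_section_dim(m, d=1.5))
```

The logarithmic growth of the admissible dimension with $m$ is the point of the sketch; the value of $d$ used in the loop is an arbitrary choice.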

Having stated that the distance between a cube and a Euclidean ball is the maximum possible,
we may turn to a desirable consequence already embedded in the definition of entropy [?].
Entropy, like any thermodynamic quantity, involves ignoring a lot of the details of the
underlying dynamical processes, as noticed earlier in this subsection. All such details are
contained in the phase space evolution of the underlying dynamical system. Hence skipping
many of these details, for small enough cells in a coarse-graining of phase-space, is tantamount
to glossing over or even ignoring a substantial volume of the cells, if we think that such an
omission will not make, statistically, any substantial difference. Since $S_{BGS}$ describes
systems having a simple phase space evolution according to Birkhoff’s ergodic theorem [65], so
that the micro-canonical density $\rho$ is uniform on a constant energy hypersurface of the
isolated system, we can simplify our considerations and use balls instead of cubes for
coarse-graining, as the distance between them is the largest possible. By doing that, we have
to ignore most of the volume of these cubes, thus simplifying the description, without losing
much, due to the uniformity of the micro-canonical measure and its proportionality to volume
within each cell.

This procedure is effective due to the fact that the majority of the volume of the cube
$I_n$ is close to its vertices, something that can be justified as follows: place the cube’s
centre of symmetry at the origin of $\mathbb{R}^n$ and express its volume in spherical coordinates.
Let $\theta$ collectively express the angular coordinates, parametrising the unit sphere $S^{n-1}$,
and let $r(\theta)$ be the radial coordinate of the boundary of $I_n$ in the direction $\theta$. Then

$$\mathrm{vol}\, I_n = \mathrm{vol}\, B_1 \int_{S^{n-1}} r(\theta)^n \, dS \tag{117}$$

where $dS$ expresses the infinitesimal area element of $S^{n-1}$, normalised so that the total
area is 1. Since $I_n$ is a cube of side length 2, its volume is $2^n$. Therefore

$$\int_{S^{n-1}} r(\theta)^n \, dS = \frac{2^n}{\mathrm{vol}\, B_1} \tag{118}$$

which, after using (112) gives approximately

$$\int_{S^{n-1}} r(\theta)^n \, dS \approx \left( \frac{2n}{\pi e} \right)^{\frac{n}{2}} \tag{119}$$

This corresponds to a cube of average radius about $\sqrt{2n/(\pi e)}$. Given that the distance
between the vertices of the cube and its centre is $\sqrt{n}$, and that the centre of each face is
1 unit away from its centre, we conclude from (119) that most of the volume of the cube is closer
to its vertices than to the centre of its faces. Hence substituting the cubes by Euclidean balls
inscribed in them, hence of unit radius, omits most of the volume of the cube, which however is
rather innocuous, as can be inferred from the above discussion.
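The estimate of the average radius of the cube can be checked numerically. The following is a minimal sketch under stated assumptions: the "average radius" is taken to be $(2^n / \mathrm{vol}\, B_1)^{1/n}$, with $\mathrm{vol}\, B_1 = \pi^{n/2} / \Gamma(n/2 + 1)$ the standard volume of the unit Euclidean ball, and it is compared with the large-$n$ value $\sqrt{2n/(\pi e)}$. The function names are ours.

```python
import math


def log_unit_ball_volume(n: int) -> float:
    """log of vol B_1 = pi^(n/2) / Gamma(n/2 + 1) in R^n."""
    return 0.5 * n * math.log(math.pi) - math.lgamma(0.5 * n + 1.0)


def cube_mean_radius(n: int) -> float:
    """(2^n / vol B_1)^(1/n): radius of the ball whose volume equals
    that of the cube [-1, 1]^n."""
    return math.exp((n * math.log(2.0) - log_unit_ball_volume(n)) / n)


def asymptotic_mean_radius(n: int) -> float:
    """Large-n estimate sqrt(2n / (pi e))."""
    return math.sqrt(2.0 * n / (math.pi * math.e))


if __name__ == "__main__":
    for n in (10, 100, 1000):
        # face centres sit at distance 1, vertices at distance sqrt(n)
        print(n, cube_mean_radius(n), asymptotic_mean_radius(n), math.sqrt(n))
```

For every $n \geq 2$ the computed radius lies strictly between 1 (the distance of the face centres) and $\sqrt{n}$ (the distance of the vertices), and it grows proportionally to $\sqrt{n}$.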

We should be very careful however, since such an argument may not necessarily apply for
systems described by $S_q$ or any of the other non-additive entropies. Such systems will have
more complicated phase space behaviour, potentially attracting sets and the like [123], which
a regular measure $\rho$ may not be able to adequately describe. As the case of the
Sinai-Ruelle-Bowen [124] measures indicates, one may have to use measures some of the marginals
of which may be nowhere absolutely continuous with respect to the phase space volume, thus
invalidating the above arguments and complicating substantially the coarse-graining process.


5.4 Unit Euclidean balls and Gaussians. 

There is another reason for considering, and quantifying, the discrepancy between
balls/ellipsoids and cubes/convex bodies/polytopes. Consider the, arguably, simplest system of
many degrees of freedom: the classical (non-relativistic) ideal gas of $N$ identical particles of
mass $1/2$ units which is placed inside an isolated cubical box of side length $L$. The phase
space of this system factorizes as $[0, L]^{3N} \times \mathbb{R}^{3N}$. The Hamiltonian is

$$\mathcal{H} = \sum_{i=1}^{N} \| p_i \|^2 \tag{120}$$

Given that the system is isolated, its total energy is conserved. Set it equal to unity for
simplicity. Hence the momentum space reduces from $\mathbb{R}^{3N}$ to the unit sphere $S^{3N-1}$.
The thermodynamic limit corresponds to taking $N \to \infty$, which gives rise to the Maxwell
distribution, which is Gaussian with respect to the molecular velocities. Probabilistically,
this is a simple manifestation of the Central Limit Theorem. This result, interpreted
geometrically, shows that a unit radius sphere of high dimension ($N \to \infty$) should be an
excellent approximation to the Gaussian distribution.

That this is indeed true had been stressed by E. Borel and later by P. Levy. More recently,
it has been emphasized in the work of V.D. Milman and M. Gromov. The argument can be
made more precise as follows [?]: consider the Euclidean ball $B_R$ of radius $R$ in
$\mathbb{R}^n$. Due to the homogeneity of volume, its volume is

$$\mathrm{vol}\, B_R = R^n \, \mathrm{vol}\, B_1 \tag{121}$$

where $B_1$ is given by (114). Assume that $\mathrm{vol}\, B_R = 1$. Hence

$$R = (\mathrm{vol}\, B_1)^{-\frac{1}{n}} \tag{122}$$

Hence a section of this ball of codimension 1 passing through the origin will be an
$(n-1)$-dimensional ball of radius $R$ whose volume is, according to (118) and (119),

$$\mathrm{vol}\, B_R^{n-1} = \mathrm{vol}\, B_1^{n-1} \, (\mathrm{vol}\, B_1)^{-\frac{n-1}{n}} \tag{123}$$

where the superscript $n-1$ indicates balls in $\mathbb{R}^{n-1}$. After using Stirling’s
approximation (115), we see that for large $n$ this section has a volume approximately equal to
$\sqrt{e}$. Consider now an $(n-1)$-dimensional section of this ball at a distance $r$ from the
ball’s center. Its radius will be $(R^2 - r^2)^{\frac{1}{2}}$. As a result its volume will be, for
large $n$, approximately given by

$$\sqrt{e} \left( \frac{\sqrt{R^2 - r^2}}{R} \right)^{n-1} = \sqrt{e} \left( 1 - \frac{r^2}{R^2} \right)^{\frac{n-1}{2}} \tag{124}$$

Following (116) we can see that a Euclidean ball of volume 1, for large $n$, has a radius of
about

$$R \approx \sqrt{\frac{n}{2 \pi e}} \tag{125}$$

Substituting (125) in (124) gives for the volume of the spherical section at distance $r$ from
the origin

$$\sqrt{e} \left( 1 - \frac{2 \pi e r^2}{n} \right)^{\frac{n-1}{2}} \approx \sqrt{e} \exp(-\pi e r^2) \tag{126}$$

So the volume of the section of co-dimension 1 of the unit volume ball, which is at a distance
$r$ from the ball’s centre, is for large $n$ almost a Gaussian distribution of variance
$1/(2 \pi e)$. What we have done is a geometric re-formulation and “derivation” of the Central
Limit Theorem.
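The Gaussian limit of the slice volumes can be checked numerically. The sketch below, whose function names are ours, assumes only the standard formula $\mathrm{vol}\, B_1 = \pi^{n/2} / \Gamma(n/2 + 1)$. It computes the exact $(n-1)$-dimensional volume of the slice at distance $r$ from the centre of the unit volume ball and compares it with the profile $\sqrt{e} \exp(-\pi e r^2)$.

```python
import math


def log_unit_ball_volume(n: int) -> float:
    """log of vol B_1 = pi^(n/2) / Gamma(n/2 + 1) in R^n."""
    return 0.5 * n * math.log(math.pi) - math.lgamma(0.5 * n + 1.0)


def slice_volume(n: int, r: float) -> float:
    """Exact (n-1)-volume of the slice at distance r from the centre of the
    n-dimensional Euclidean ball of unit volume."""
    radius = math.exp(-log_unit_ball_volume(n) / n)  # R = (vol B_1)^(-1/n)
    if r >= radius:
        return 0.0
    log_vol = (log_unit_ball_volume(n - 1)
               + 0.5 * (n - 1) * math.log(radius * radius - r * r))
    return math.exp(log_vol)


def gaussian_profile(r: float) -> float:
    """Large-n limit sqrt(e) exp(-pi e r^2), of variance 1 / (2 pi e)."""
    return math.sqrt(math.e) * math.exp(-math.pi * math.e * r * r)


if __name__ == "__main__":
    for r in (0.0, 0.1, 0.2, 0.3):
        print(r, slice_volume(1000, r), gaussian_profile(r))
```

The two columns approach each other as $n$ grows, with corrections that vanish in the limit $n \to \infty$.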


What we also observe in the above derivation is that most of the volume of the ball
concentrates around a lower dimensional section passing through its center. This observation
turns out to be independent of the validity of the Central Limit Theorem, even if this is not
obvious from the above arguments. Counter-intuitive as it may be, it is frequently encountered
in convex asymptotic geometry, i.e. in Banach spaces as their dimension increases to infinity.
Extensions for the cases of Riemannian manifolds with a Ricci curvature bounded from below, or
in terms of the behavior of the lowest non-trivial eigenvalue of the spectrum of their
Laplacian, also exist. The same can be stated for smooth metric measure spaces under additional
assumptions and generalizations of the definition of convexity. This underlying behavior has
been called “concentration of measure” [125], [126], [127], [128], [114], [129], [?], [130],
[118] and was the main avenue for V.D. Milman proving Dvoretzky’s theorem, which we will come
to in the next Section.


5.5 About “quantum blobs”.

In the previous Sections we saw that there is a substantial discrepancy between the
coarse-graining approaches that rely on cubes, which are the outcome of independence/simplicity
arguments, and those that rely on balls/ellipsoids, which are induced from symplectic
capacities. Ultimately one would like to have a “natural way”, to the extent possible, to decide
on such coarse-graining. The existence of the fundamental constant $h$ arising in Quantum
Physics partially helps provide a scale for such a phase-space coarse-graining, but the exact
shape of the fundamental cell employed is still a matter of choice, as has previously been
pointed out. Since balls/ellipsoids minimise all symplectic capacities and their high
dimensional sections are Gaussian, as seen in the previous subsection, one may be willing to use
them as the fundamental cells for phase-space coarse-graining. This would be favourably
supported by Quantum Mechanics, where the minimum uncertainty, hence “as precise as possible”,
wave-functions for quadratic potentials, which are the lowest order approximations to any
“generic” analytic potentials, are Gaussians. See also the discussion in subsection 4.5. In
this subsection we rely on the work of M. de Gosson and collaborators [86], [87], [88], [89],
[90], [91], [92], [93], [94], [95] where many more details on similar topics can be found.

“Quantum blobs” are, very much like Gaussians, minimum uncertainty sets whose size is
measured using symplectic capacities instead of volumes. Their advantage, as opposed to cubic
cells, is that they remain invariant under canonical transformations, hence they preserve the
Hamiltonian structure of the dynamical system. To be more specific, we work in $\mathbb{R}^{2n}$
and define a quantum blob $Q^{2n}(x_0)$ to be the image of the Euclidean ball
$B^{2n}_{\sqrt{\hbar}}(x_0) \subset \mathbb{R}^{2n}$, of radius $\sqrt{\hbar}$ with $\hbar = h / 2\pi$,
under a canonical/symplectic transformation. Using the results of subsection (4.5) we see that
for a quantum blob

$$c(Q^{2n}(x_0)) = \frac{h}{2} \tag{127}$$

which has the same order of magnitude as the symplectic capacity of a cube. By contrast, since
symplectic transformations are volume-preserving, the volume of a quantum blob, following
(118), is

$$\mathrm{vol}(Q^{2n}(x_0)) = \frac{h^n}{n! \, 2^n} \tag{128}$$

As a result, this volume is $n! \, 2^n$ times smaller than that of a cube. Since we are
interested in the thermodynamic limit $n \to \infty$, the quantum blob has far (“infinitely”)
smaller volume than that of a cube, despite the fact that its sections along all symplectic
2-planes are of comparable area.
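To get a feeling for how fast the volume ratio $n! \, 2^n$ between the cube and the quantum blob grows, one can evaluate its logarithm. This is only an illustrative sketch: the function name is ours, and the cube is assumed to have volume $h^n$, which is the normalisation implied by the ratio quoted above.

```python
import math


def log_cube_to_blob_volume_ratio(n: int) -> float:
    """Natural log of n! 2^n, the ratio vol(cube) / vol(quantum blob),
    assuming a cube of volume h^n in the 2n-dimensional phase space."""
    return math.lgamma(n + 1) + n * math.log(2.0)


if __name__ == "__main__":
    for n in (1, 3, 10, 100, 1000):
        # dividing by log(10) gives the number of decimal orders of magnitude
        print(n, log_cube_to_blob_volume_ratio(n) / math.log(10.0))
```

The ratio grows faster than exponentially in $n$, which is the sense in which the blob occupies an “infinitely” small fraction of the cube in the thermodynamic limit.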

The above considerations of cubes versus ellipsoids as fundamental cells of phase-space
coarse-graining raise another question. If the two-dimensional sections of these two candidates
for fundamental cells are of almost the same area but their volumes are so different, is there any
intermediate situation? Given that quantum blobs have a substantially smaller volume than
the corresponding cubes and that inside each cube, or polytope, there is a maximal volume
ellipsoid (F. John’s theorem), is there any intermediate dimensional section of the cube which
is close to being a ball/ellipsoid?


6 Dvoretzky’s theorem, Dvoretzky dimension and dualities. 

In this Section we present Dvoretzky’s theorem and the Dvoretzky dimension, and their
implications for the definition of entropy. We do not claim that we can actually predict the
functional form of entropy to be used in each occasion, as would be desirable, if feasible, in
our opinion. However we can at least see that the choices of $S_{BGS}$ and $S_q$ are plausible,
asymptotically, for the case of the “thermodynamic limit” ($n \to \infty$) seen through a
perspective/viewpoint induced by Dvoretzky’s theorem and its implications. The choices of the
appropriate deformation fields such as $\mathbb{R}_q$ and $\mathbb{R}_\kappa$ contribute in
determining the shape of the fundamental cells in which the phase space $\mathfrak{M}$ of the
Hamiltonian system should be divided in a coarse-graining process. However such a choice, if it
is also followed by the other axioms of Shannon / Khintchin or Abe / Santos etc., amounts to the
choice of the entropic functional employed in any particular situation. Still the viewpoint that
we follow in this Section may be useful in seeing the entropy under a different light, which may
allow inferences that may not be easily accessible otherwise, such as the origin of dualities in
non-additive entropies etc.


6.1 Dvoretzky’s theorem, Dvoretzky’s dimension and entropies. 

We return to the question raised at the end of subsection 5.5. The two-dimensional sections of
cubes and of quantum blobs are of almost the same 2-dim “area” but their volumes are very
different, and inside each cube or convex body/polytope there is a maximal volume ellipsoid
(F. John’s theorem). Is there any intermediate dimensional section of the cube which is close
to being a ball/ellipsoid? We saw in subsection 5.3 that for a symmetric convex body to be
close to a ball, it must have exponentially many faces. Equivalently, a cube in $\mathbb{R}^n$
has almost spherical sections of dimension of order $\log n$. Hence, if the dimension of a
section of a cube is at most of order $\log n$, then it can be reasonably close to a ball,
therefore the discrepancy between phase space coarse-graining by cubes and by balls can be seen
as non-significant. On the other hand, as seen in subsection 5.3, if such a section has
dimension much larger than $\log n$, then the cube and the ball will be substantially different
from each other in their volumes, hence a coarse-graining procedure would give quite different
results for these two fundamental cells. Since cubes and balls are as distinct from each other
as possible, as measured by their Banach-Mazur distance, getting the same coarse-graining
results for both of them can be seen as relatively re-assuring that we will get the same
results by using as fundamental cell of phase space any other symmetric convex body (polytope).

From a physical viewpoint, one can interpret this way of thinking to mean that $\log n$, as
$n \to \infty$, is asymptotically the optimal order of magnitude of the dimension that one can
use in order for the coarse-graining results of phase space to be virtually independent of the
exact shape of the fundamental cell. Hence the statistically important characteristics of the
underlying dynamical system would be preserved, geometrically, but there would still be a
reduction in the complexity as measured by the number of the effective degrees of freedom of the
system. But this is exactly what the entropy was designed to capture. Of course, the entropy is
associated to a measure, which may not be just the volume. However, if one assumes the validity
of the ergodic hypothesis, the micro-canonical density is uniform on the constant energy
hypersurface under consideration, as was also previously pointed out (subsection 5.3). Hence all
the arguments pertaining to the micro-canonical measure are reduced to the corresponding ones
about volumes. Hence, in the most conventional sense, the entropy should have the form of a
natural logarithm of the accessible phase space, in accordance with Boltzmann’s, Gibbs’,
Shannon’s etc. ideas.

One wonders whether anything similar can be stated about non-extensive Statistical
Mechanics and the non-additive entropies such as $S_q$ and $S_\kappa$. After all, the underlying
geometric structures are almost the same, the only difference being that cubes will have to be
replaced by their images under (14), (23) and their generalisations, generalised shapes which
are symmetric convex polytopes. The underlying symplectic structure, by contrast, remains the
same as Hamilton’s equations are still assumed to be applicable for such systems (see also the
related comment in subsection 3.3 after eq. (38)). Hence the question that arises, in analogy
with the cube, is whether there is any dimension beyond which a symmetric convex body (polytope)
has almost spherical sections. There is an additional level of difficulty however: it is not
clear that the systems described by $S_q$ or $S_\kappa$ are ergodic. On the contrary, it has been
conjectured that these non-additive entropies describe exactly non-ergodic systems (see also the
last paragraph of subsection 5.3). Even though any “reasonable” measure admits a decomposition
into ergodic components [65], this is not enough to justify the reduction of the phase-space
measure to just the volume. To proceed, we assume that each phase space cell is so small that
such a reduction is possible. This can occur, for instance, when the variation of the measure
density is slow when compared to the spatial extent of each cell.


To make the question more precise and general, assume that $\mathcal{X}$ is an $n$-dimensional
Banach space whose unit ball expresses the generalized independence induced by some
non-additive entropic functional. Then, is there a subspace $E$ of $\mathcal{X}$ of dimension
$k(\varepsilon, n)$ such that $d(E, l_2^k) \leq 1 + \varepsilon$, for $\varepsilon > 0$? In this
expression $d$ stands for the Banach-Mazur distance between $E$ and the $k$-dimensional
Euclidean (Hilbert) space $l_2^k$. Slightly more geometrically, one can start from the symmetric
convex body $\mathcal{K}$ in $\mathbb{R}^n$ which expresses the afore-mentioned independence. Does
there exist a section $\mathcal{K} \cap E$ of $\mathcal{K}$ by a subspace $E$ of dimension
$k(\varepsilon, n)$ and an ellipsoid $\mathcal{E}$ so that one has the inclusion
$\mathcal{E} \subset \mathcal{K} \cap E \subset (1 + \varepsilon) \mathcal{E}$? It should be noticed
that the above questions are asymptotic, in the sense that $n$ is large, namely $n \to \infty$.
The answer to these equivalent questions is affirmative, as was proved in 1961 by A. Dvoretzky
[114], [118], [119], [125], [126], [121], [122], a result which is one of the cornerstones of
Geometric Functional Analysis / Asymptotic Convex Geometry. The result is extremely
counter-intuitive, as someone can easily see by trying to imagine a section of a cube which is
almost spherical. Such a result is possible exactly because the dimension $n$ is assumed to be
quite large.
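A rough numerical experiment can make the counter-intuitive statement more tangible. The sketch below is purely illustrative and is not a proof of anything: it draws a random $k$-dimensional subspace $E$ of $\mathbb{R}^n$, samples Euclidean unit vectors of $E$, and records the ratio of the largest to the smallest value of the sup-norm (whose unit ball is the cube) found on them. The supremum of this ratio over all unit vectors of $E$ bounds the Banach-Mazur distance of the section from a $k$-dimensional ball from above; a finite sample can only underestimate that supremum. All function names and parameter values are ours.

```python
import math
import random


def random_orthonormal_basis(n, k, rng):
    """k orthonormal vectors in R^n, by Gram-Schmidt on Gaussian vectors."""
    basis = []
    while len(basis) < k:
        v = [rng.gauss(0.0, 1.0) for _ in range(n)]
        for b in basis:
            dot = sum(x * y for x, y in zip(v, b))
            v = [x - dot * y for x, y in zip(v, b)]
        norm = math.sqrt(sum(x * x for x in v))
        if norm > 1e-9:
            basis.append([x / norm for x in v])
    return basis


def sampled_distortion(n, k, samples=400, seed=0):
    """max/min of the sup-norm over sampled Euclidean-unit vectors of a
    random k-dimensional subspace of R^n (a sampled, hence lower, estimate)."""
    rng = random.Random(seed)
    basis = random_orthonormal_basis(n, k, rng)
    lo, hi = math.inf, 0.0
    for _ in range(samples):
        c = [rng.gauss(0.0, 1.0) for _ in range(k)]
        c_norm = math.sqrt(sum(t * t for t in c))
        sup_norm = max(
            abs(sum(c[j] * basis[j][i] for j in range(k))) for i in range(n)
        ) / c_norm
        lo, hi = min(lo, sup_norm), max(hi, sup_norm)
    return hi / lo


if __name__ == "__main__":
    n = 1000
    for k in (2, 3, 5):
        # compare with sqrt(n), the distance of the full cube from the ball
        print(k, sampled_distortion(n, k), math.sqrt(n))
```

For the full cube the corresponding ratio is $\sqrt{n}$, so a sampled value close to 1 for small $k$ is at least consistent with the existence of almost spherical low-dimensional sections.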

As it befits a fundamental result, there are different re-formulations in slightly different
contexts under the general title of “Dvoretzky’s theorem” and several alternative proofs. One
of the still unsettled questions is the optimal form of the Dvoretzky dimension
$k(\varepsilon, n)$ in the above statements. The best known estimate is

$$k(\varepsilon, n) \geq c \, \varepsilon^2 \log n, \qquad \varepsilon \in (0, 1) \tag{129}$$

The case of our interest (see also the discussion in the next subsection) involves the
$n$-dimensional Banach spaces endowed with the norms (30), (31) which were indicated as $l_p$,
$1 \leq p \leq \infty$. It is a non-trivial result that the corresponding Dvoretzky dimensions
$k(l_p)$ are given asymptotically, as $n \to \infty$, by

$$k(l_p) \approx \begin{cases} n, & 1 \leq p \leq 2 \\ n^{2/p}, & 2 < p < \infty \\ \log n, & p = \infty \end{cases} \tag{130}$$

We observe that when $p > 2$ then we have a power-law behaviour. If the Dvoretzky dimension
of a cube $k(l_\infty)$ is responsible for giving rise to the logarithmic form of the entropy, as
we previously suggested, then one can see that power-law entropic forms such as $S_q$ and
$S_\kappa$ may be seen, at a first glance, as arising from the Dvoretzky dimension of $l_p^n$,
$p > 2$. Even though this would not be exactly correct, in the face of the comments of the next
subsection, at least it may be considered as suggestive, and indicative of yet not fully
appreciated underlying structures. We observe in (130) the remarkable property that
$k(l_p) \approx n$, $1 \leq p \leq 2$. Pushed to its limit, and substituting $q$ for $p$ in
(130), if the above discussion is pertinent for $S_q$, then the assumed range of entropic
indices $q \in (0, 1)$ would cover all possible cases for the leading power-law behaviour of the
entropic functional. If the conclusions of [?] prove to be correct, the range $q \in (0, 1)$ in
$S_q$ is the only acceptable one for the entropic index related to a Hamiltonian system of many
degrees of freedom. Similar things can probably be stated about the entropic parameter $\kappa$
and the entropic functional $S_\kappa$.
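The three asymptotic regimes of the Dvoretzky dimension of $l_p^n$ (linear in $n$ for $1 \leq p \leq 2$, the power law $n^{2/p}$ for $2 < p < \infty$, and $\log n$ for $p = \infty$) can be tabulated with a few lines of code. This is a minimal sketch: the function name is ours, and all multiplicative constants, including the dependence on $\varepsilon$ and on $p$, are dropped, so only orders of magnitude are meaningful.

```python
import math


def dvoretzky_dimension_order(n: int, p: float) -> float:
    """Order of magnitude of the Dvoretzky dimension of l_p^n as n grows
    (constants dropped): n for 1 <= p <= 2, n^(2/p) for 2 < p < infinity,
    log n for p = infinity."""
    if p < 1.0:
        raise ValueError("l_p is a normed space only for p >= 1")
    if p <= 2.0:
        return float(n)
    if math.isinf(p):
        return math.log(n)
    return n ** (2.0 / p)


if __name__ == "__main__":
    n = 10**6
    for p in (1.0, 2.0, 3.0, 4.0, 10.0, math.inf):
        print(p, dvoretzky_dimension_order(n, p))
```

The output makes the change of regime at $p = 2$ visible: below it the order is that of the full dimension, above it the order decays as a power law and degenerates to a logarithm for the cube.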

Potentially existing sub-leading terms in the non-additive power-law functionals cannot be
detected by the asymptotic form (130), so far as we can see. It is probably exactly such
sub-leading terms that determine the differences between functionals such as $S_q$ and
$S_\kappa$, and therefore these terms may turn out to be able to distinguish which one of these
functionals is more appropriate for describing which system. Still, referring to more general
systems described by the above two, or any other, entropic functionals of power-law form, we are
not quite certain about how to interpret the significance, if any, of the “phase transition” in
the Dvoretzky dimension of $l_p$ exhibited between $p \in [1, 2]$ and $p \in (2, \infty)$.

6.2 On the $(\varepsilon, \tau)$ entropy and the use of $l_p$ spaces.

It may be of interest to elaborate a bit upon the dynamical origin and role of $\varepsilon$
appearing in Dvoretzky’s theorem. This parameter expresses, as was previously discussed in
subsection 3.3, our fundamental inability to follow the evolution of the underlying dynamical
system with absolute precision. One way to quantify this is through a modification of the
Kolmogorov-Sinai entropy [65] such as the one presented in [131], called $\varepsilon$-entropy:
in the case of the Kolmogorov-Sinai entropy, one introduces partitions/cells of the phase space
of size $\varepsilon$ over which one defines $S_{BGS}$. Eventually, one takes the size of the
partition $\varepsilon \to 0$. By contrast, in the $\varepsilon$-entropy the size of the
fundamental cell remains finite, but all other steps are the same as in the case of the
Kolmogorov-Sinai entropy construction. One can refer to [131] for the advantages and
difficulties that such a definition implies. What is important for our purposes, however, is
that in this definition the finite, even if small, phase-space “resolution” $\varepsilon$ enters
explicitly the definition of entropy. Hence this parameter will invariably appear, explicitly or
implicitly, in the composition properties such as (8), (9) or (19) and therefore in
$\mathbb{R}$, $\mathbb{R}_q$ or $\mathbb{R}_\kappa$ respectively. Since one defines the fundamental
polytopes of phase space coarse-graining, which express geometrically “independence”, via these
algebraic operations and structures, the shape of these cells/polytopes will invariably contain
a dependence on $\varepsilon$. Therefore there will be some finite uncertainty about the exact
shape of the convex body that one should use in Dvoretzky’s theorem, based on the above physical
arguments. This uncertainty is quantified and appears in Dvoretzky’s theorem as the upper bound
requirement of the Banach-Mazur distance in the theorem and explicitly in the Dvoretzky
dimension $k(\varepsilon, n)$ as its dependence on $\varepsilon$. In this paragraph, we have used
$\varepsilon$ twice, in two quite different contexts. This is clearly a substantial abuse of
notation: even though one may possibly expect to find a relation between the indeterminacies
expressed by $\varepsilon$ in these two contexts, one should not assume that they are equal, let
alone the same.

A second point that may be clarifying is the extensive use of the $l_p$,
$1 \leq p \leq \infty$, Banach spaces in arguments (subsection 3.2 and forth) in this work. One
reason for such use is that a lot of their features are reasonably well-understood, when
compared to general Banach spaces. This however does not make them physically more relevant,
just formally more tractable. Based on $S_{BGS}$ and $S_q$, it appears that the concept of
independence as expressed by cubes, namely the unit ball of $l_\infty$, should be sufficient for
our purposes. After all, the field isomorphisms (14), (23) are invertible and distance
non-decreasing. So they preserve the number of vertices of these cubes. All that they do is to
uniformly expand the sides of the fundamental cubical cells of the phase space partition used in
coarse-graining. In addition, any legitimate cube should have clearly defined vertices. However,
all $l_p$, $1 < p < \infty$, have unit spheres that are everywhere differentiable, hence they do
not possess any clearly defined (point) vertices.

To address the last concern, one should refer to the ideas leading to the use of the
$(\varepsilon, \tau)$ entropy in the first paragraph of this subsection. In realistic models,
even classical ones, there is always some uncertainty associated to the scale of phase-space
coarse-graining $\varepsilon$. This should be reflected in the composition property of the
pertinent entropic functional. This indeterminacy, in turn, makes the concept of independence
become a bit “vague”: the corresponding cubes do not have well-defined vertices and faces, but
rather areas of small but finite “thickness” as faces, and areas of small spatial extent /
“radius” as vertices. Hence one should not be able to distinguish between “cubes” in $l_p$ for,
let’s say, $p = 1$ and for $p = 1 + \delta$, where $0 < \delta \ll 1$. A second reason in favour
of using not only the $l_p$ spaces but also more general Banach spaces in employing Dvoretzky’s
theorem for physically relevant cases is that we want to have a formalism that is flexible
enough to accommodate many families of entropic functionals. If the price that one pays for
such a flexibility is small, then one is willing to go along with a slightly more elaborate, but
far more general, formalism to accomplish these ends. Consider, for instance, models of highly
anisotropic systems, as practical as materials possessing layered structures [?] or as exotic
and conjectural as anisotropic (Hořava-Lifshitz [133] etc.) gravity. Then it may not be entirely
unreasonable to propose a direction-dependent entropic form for such systems, as long as there
is a relatively clear separation of the dynamics and scales in the different directions. This
would elevate the non-extensive parameters of entropic functionals such as $S_q$, $S_\kappa$
into vectors. If this is true, then the corresponding unit cells, expressing independence, in
phase-space coarse-graining will be anisotropic convex bodies which however can still be
accommodated by the convexity formalism presented here. To this date though, there has not been
any compelling theoretical reason to introduce any such vector-valued entropic index
functionals, so far as we know.


6.3 Polarity, Mahler’s conjecture and symplectic rigidity. 

An issue that may be worth discussing is that of polar duality. From a geometric as well
as analytical viewpoint, polarity has turned out to be of considerable significance, since the
earliest days of Euclidean geometry. The functional analytic analogy of the polar
$\mathcal{K}^\circ$ of a convex body $\mathcal{K}$ with the dual space $\mathcal{X}^*$, presented in
subsection 5.1, has important and far-reaching consequences. One of them is that the
Banach-Mazur distance remains invariant under such a duality, namely for any normed spaces
$\mathcal{X}$, $\mathcal{Y}$ the Banach-Mazur distance obeys

$$d(\mathcal{X}, \mathcal{Y}) = d(\mathcal{X}^*, \mathcal{Y}^*) \tag{131}$$

An immediate implication is that since, according to John’s theorem

$$d(l_\infty^n, l_2^n) = \sqrt{n} \tag{132}$$

and because $(l_\infty^n)^* = l_1^n$ and the Hilbert space is self-dual (the Euclidean ball is
the polar of itself), one also has

$$d(l_1^n, l_2^n) = \sqrt{n} \tag{133}$$

Since the different $l_p^n$ via their unit balls express different ways of defining
independence, induced by the various non-additive entropies, one also needs the extension of
the above to

$$d(l_p^n, l_q^n) = n^{\frac{1}{p} - \frac{1}{q}} \tag{134}$$

where either $1 \leq p \leq q \leq 2$ or $2 \leq p \leq q \leq \infty$. The remaining option,
namely $1 \leq p < 2 < q \leq \infty$, gives only bounds for the Banach-Mazur distance as

$$C_1 n^\beta \leq d(l_p^n, l_q^n) \leq C_2 n^\beta, \qquad \beta = \max \left\{ \frac{1}{p} - \frac{1}{2}, \ \frac{1}{2} - \frac{1}{q} \right\} \tag{135}$$

with $C_1$, $C_2$ being positive constants. Going back to Dvoretzky’s theorem, the
Figiel-Lindenstrauss-Milman theorem provides for the Dvoretzky dimension $k$ of a Banach space
$\mathcal{X}$ and its dual $\mathcal{X}^*$ the lower bound

$$k(\mathcal{X}) \, k(\mathcal{X}^*) \geq C \, \frac{n^2}{d(\mathcal{X}, l_2^n)^2} \tag{136}$$

which, since $d(\mathcal{X}, l_2^n) \leq \sqrt{n}$, gives that

$$k(\mathcal{X}) \, k(\mathcal{X}^*) \geq C' n \tag{137}$$

where $C$ and $C'$ are positive constants. As a result, for any such Banach space $\mathcal{X}$,
one has that either $k(\mathcal{X}) \geq \sqrt{C' n}$ or $k(\mathcal{X}^*) \geq \sqrt{C' n}$, a
result that turns out to be sharp.
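The power of $n$ governing the Banach-Mazur distance between $l_p^n$ and $l_q^n$ can be collected in one small function. The sketch uses the standard results: the exponent is $|1/p - 1/q|$ when $p$ and $q$ lie on the same side of 2, and $\max\{1/p - 1/2, \, 1/2 - 1/q\}$ for $p < 2 < q$, in which case the distance is known only up to multiplicative constants. The function names are ours and the constants are ignored.

```python
import math


def _inverse(p: float) -> float:
    """1/p with the convention 1/infinity = 0."""
    return 0.0 if math.isinf(p) else 1.0 / p


def bm_distance_exponent(p: float, q: float) -> float:
    """Exponent a such that d(l_p^n, l_q^n) is of order n^a."""
    if min(p, q) < 1.0:
        raise ValueError("l_p is a normed space only for p >= 1")
    p, q = min(p, q), max(p, q)
    if q <= 2.0 or p >= 2.0:
        # same side of 2: the distance equals n^(1/p - 1/q)
        return _inverse(p) - _inverse(q)
    # p < 2 < q: only two-sided bounds with the same power of n
    return max(_inverse(p) - 0.5, 0.5 - _inverse(q))


if __name__ == "__main__":
    for p, q in ((1.0, 2.0), (2.0, math.inf), (1.0, math.inf), (1.5, 3.0)):
        print(p, q, bm_distance_exponent(p, q))
```

Note that the exponent never exceeds $1/2$: the cube and the cross-polytope are each at distance of order $\sqrt{n}$ from the ball, and also of order $\sqrt{n}$ from each other.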

We refer to dualities in this work, not only because they play an important role in Convex
Geometry, but also because they may be of importance for the case of non-additive entropies. It
has been surmised from some data, for $S_q$ for instance, that some systems seem to be invariant
under the entropic parameter changes

$$q \mapsto 2 - q, \qquad q \mapsto \frac{1}{q} \tag{138}$$

Whether this actually happens, and if so what is the origin of such invariances, is still an
open question. The transformations (138) are the generators of Möbius transformations for
$q \in \mathbb{C}$. Clearly the case $q = 1$ in (138), which corresponds to $S_{BGS}$, remains
invariant under such transformations, hence issues that can be raised for $q \neq 1$ pertaining
to (138) are undetectable for $S_{BGS}$. Convex polarity of the unit balls and the corresponding
Banach space duality may somehow be related to (138) in a currently not understood manner.
However, a pattern that starts emerging, and that the above considerations are suggestive of, is
that it may be worthwhile to analyse in parallel, and compare to each other, features of systems
whose entropic indices are connected by some duality transformation. Assuming that such systems
are described by different values of the entropic parameter of the same single-parameter family
of functionals such as $S_q$ or $S_\kappa$, it may be worth investigating which features of such
systems are common, or in some, still vague, sense “opposite” / “complementary”.
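The elementary properties of the two maps $q \mapsto 2 - q$ and $q \mapsto 1/q$ can be verified with exact rational arithmetic. The sketch below, with function names of our choosing, checks that each map is an involution fixing $q = 1$, and iterates their composition $q \mapsto 2 - 1/q$, which is a Möbius transformation that is not of finite order.

```python
from fractions import Fraction


def additive_dual(q):
    """q -> 2 - q"""
    return 2 - q


def multiplicative_dual(q):
    """q -> 1/q (q must be non-zero)"""
    return 1 / q


def composed(q):
    """q -> 2 - 1/q, the multiplicative map followed by the additive one."""
    return additive_dual(multiplicative_dual(q))


if __name__ == "__main__":
    q = Fraction(3, 4)
    print(additive_dual(additive_dual(q)) == q)              # involution
    print(multiplicative_dual(multiplicative_dual(q)) == q)  # involution
    orbit = [Fraction(3)]
    for _ in range(5):
        orbit.append(composed(orbit[-1]))
    print(orbit)
```

Starting from a rational value other than 1, the iterates of the composed map are all distinct, so the two involutions generate an infinite family of "dual" indices rather than a finite set. This is a statement about the maps only, not about any physical invariance.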

To push this viewpoint a little bit further, it may be worth examining concurrently, from
both a convex and a symplectic geometric viewpoint, properties of the unit balls of finite
dimensional Banach spaces that are dual to each other: $\mathcal{X}$ endowed with the norm
$\| \cdot \|$ and $\mathcal{X}^*$ endowed with the dual norm $\| \cdot \|_*$. To this end, one may
consider examining convex and symplectic geometric properties of
$\mathcal{X} \times \mathcal{X}^*$. A straightforward observation is that this vector space has a
canonical symplectic structure: assume that $X, Y \in \mathcal{X}$ and that
$X^*, Y^* \in \mathcal{X}^*$ are their respective duals. Then the canonical symplectic structure
$\omega$ on $\mathcal{X} \times \mathcal{X}^*$ is defined by

$$\omega((X, X^*), (Y, Y^*)) = X^*(Y) - Y^*(X) \tag{139}$$

and the corresponding Liouville form of $\mathcal{X} \times \mathcal{X}^*$ is, of course,
$\omega^n / n!$. We saw throughout this work that the Euclidean ball and the cube are, in some
sense, as different from each other as possible, and that even though the former behaves quite
well under symplectic transformations the same is not true for the latter. Therefore, it may
come as a complete surprise that for the case of the cube $I_n \subset \mathbb{R}^n$ and its
polar, the cross-polytope $I_n^\circ \subset \mathbb{R}^n$, the interior of
$I_n \times I_n^\circ$ turns out to be symplectically diffeomorphic to the interior of the
Euclidean ball in $\mathbb{R}^{2n}$ of the same volume [?]. This again shows the unexpected
features of symplectic geometry, where flabbiness and rigidity can be found in totally
unexpected places.
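The canonical symplectic structure on the product of a space with its dual can be written down concretely in coordinates. In the sketch below we assume, for illustration only, that the space is $\mathbb{R}^n$ and that linear functionals are represented by coefficient vectors acting through the dot product. The function names are ours.

```python
def pairing(xi, x):
    """Value xi(x) of the linear functional xi on the vector x, with
    functionals represented by their coefficient vectors."""
    return sum(a * b for a, b in zip(xi, x))


def omega(u, v):
    """omega((x, xi), (y, eta)) = xi(y) - eta(x)."""
    (x, xi), (y, eta) = u, v
    return pairing(xi, y) - pairing(eta, x)


if __name__ == "__main__":
    u = ((1, 0), (0, 2))   # (vector, functional)
    v = ((3, 1), (1, 0))
    print(omega(u, v), omega(v, u), omega(u, u))
```

The form is bilinear and antisymmetric, and it pairs each coordinate direction of the space only with the corresponding direction of the dual, which is what makes it the standard symplectic form in these coordinates.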

On the geometric side, a question in the spirit of the isoperimetric problem [TH], [114], [134] 
which was posed by Mahler (ca. 1939) was to find upper and lower bounds for the Liouville 
volume of $B_1 \times B_2$, where $B_1$, $B_2$ stand for the unit balls of $X$ and $X^*$ respectively. This volume 
$\mathrm{vol}(B_1 \times B_2)$, often called “Mahler volume”, is invariant under linear invertible transformations. 
The upper bound was determined in 2 and 3 dimensions in [135], and generalised to any dimension 
in [136]; it is attained if and only if $X$ is the Euclidean space (Blaschke-Santalo inequality). The 
equality case was proved in [137]. Mahler conjectured that the lower bound is $4^n/n!$ and would be sharp. 
This lower bound would clearly apply for the pair of the cube and its polar, the cross-polytope. 
Mahler himself verified this conjecture in 2 dimensions but in higher dimensions the conjecture 
remains unproven. What has been proved though is the conjecture up to a multiplicative 
factor whose best value known today is given in [138]. In an interesting recent development, 
[139], [140] assumed the validity of the Viterbo conjecture (eqs. (99), (100) and the discussion 
around them), and proved that the Hofer-Zehnder capacity for a symmetric convex body $\mathcal{K}$ 
and its polar $\mathcal{K}^\circ$ is 

$c(\mathcal{K} \times \mathcal{K}^\circ) = 4$ (140) 

which, in turn, showed that the Viterbo conjecture implies the validity of the Mahler conjecture. 
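
The two bounds on the Mahler volume discussed above can be compared numerically. The following 
is a minimal sketch, not part of the original argument: it assumes the cube is normalised as 
$[-1,1]^n$ (so that its polar is the unit cross-polytope), it uses the fact that the Euclidean unit 
ball is its own polar, and the function names `mahler_volume_cube` and `mahler_volume_ball` are 
hypothetical names chosen for illustration. 

```python
import math


def mahler_volume_cube(n: int) -> float:
    """Volume of cube x cross-polytope for the cube [-1, 1]^n (assumed normalisation).

    vol(cube) = 2^n and vol(cross-polytope) = 2^n / n!, so the product is 4^n / n!,
    which is the lower bound conjectured by Mahler.
    """
    return 4.0 ** n / math.factorial(n)


def mahler_volume_ball(n: int) -> float:
    """Volume of B x B for the Euclidean unit ball B in R^n.

    The Euclidean unit ball is its own polar, so the Mahler volume is vol(B)^2,
    the value that saturates the Blaschke-Santalo upper bound.
    """
    unit_ball_volume = math.pi ** (n / 2) / math.gamma(n / 2 + 1)
    return unit_ball_volume ** 2


if __name__ == "__main__":
    for n in range(1, 9):
        lower = mahler_volume_cube(n)
        upper = mahler_volume_ball(n)
        print(f"n={n}: cube/cross-polytope {lower:.6f}, "
              f"Euclidean ball {upper:.6f}, ratio {upper / lower:.6f}")
```

In one dimension the two values coincide (both equal 4), while for n = 2 the cube and its polar give 
8 against $\pi^2 \approx 9.87$ for the Euclidean disc. 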

This subsection used some symplectic and convex geometric facts and conjectures to suggest 
that it may be formally fruitful to look simultaneously at pairs of systems, rather 
than single systems, described by entropies belonging to the same single-parameter family 
but having harmonically conjugate indices (which geometrically represent polarity and Banach 
space duality). It remains to be seen whether this approach may provide some insights into the 
nature of such systems, as well as into the possible invariances, and their origin, of non-additive 
entropies such as $S_q$ under “duality” transformations such as (138). 


7 Conclusions and discussion. 

In this work we presented the view that the source of entropy can be ascribed to two mutually 
exclusive ways of performing phase space coarse-graining for Hamiltonian systems with many 
degrees of freedom. The underlying Euclidean/Riemannian structure favours cells that are cubical. 
By contrast, the symplectic structure favours ellipsoids. We discussed ways to measure 
the discrepancy between these two approaches and also gave estimates, via Dvoretzky’s dimension, 
of the minimal dimension of the spaces in which they give almost the same results. So it is, 
in some sense, as if there is a set of variables present at the microscopic level that reflects the 
number of variables that one sees as the outcome of a statistical analysis, i.e. in thermodynamics. 
We cannot quite dare claim that these variables are the same or, even more so, that 
there is a “phase-space thermodynamic” behaviour. It just appears from Dvoretzky’s theorem 
that at the microscopic level one can infer a number of variables that is of the same order of 
magnitude as the number needed for a macroscopic description of the system. Further investigation 
in this direction may be of some interest. Moreover, we saw some preliminary formal indications 
of the suspected presence, and of the origin, of dualities of non-additive entropies 
via polarity. 

One could certainly expand this work in both the symplectic and the convex geometric 
directions, if deemed necessary. The symplectic capacities are still not very well understood 
objects. In case this looks too removed from the modelling of physical systems, it may be worth 
mentioning that there is an elaborate and flourishing research area on the interface between 
symplectic geometry and string theory. Even though the goals and approaches in this area 
may appear substantially different from the ones of Statistical Mechanics, some general ideas 
and technical approaches, especially those pertaining to dualities [141], [142], [143], may be profitably 
adapted in the present context. After all, quantum perturbative string theory, like any quantum 
theory, has a statistical interpretation and explicitly uses methods of statistical mechanics 
relying on $S_{BGS}$. To what extent one may wish to consider other functionals in such a statistical 
approach is unclear. However, string theory originated in dual resonance models 
that were eventually superseded by QCD, which is asymptotically free and has a Wilson loop 
formulation [144]. This shows us that low energy correlations become dominant, a feature of systems 
that non-additive entropies such as $S_q$ claim to describe. Hence it may be worth looking into 
string theory from a non-additive entropy viewpoint. There is more than just pure speculation 
on this front: using the phenomenological asymptotic bootstrap approach of Hagedorn 
for strong interactions, some recent results suggest an important role that $S_q$ may play in this 
regime. Such phenomenological approaches, relying partly on $S_q$, seem to be, most importantly, 
in accordance with existing experimental data [145], [146], [147], [148], [149], [150]. 

One can also use several well-known results of convex geometric analysis, such as the Bourgain 
distortion theorem or the Johnson-Lindenstrauss flattening lemma, to expand upon 
the results that just used Dvoretzky’s theorem and the associated dimension [113], [115], [116], 
[118], [121], [122]. Whether such results can be generalized and can lead to interesting conclusions 
pertinent to non-additive entropies is not clear to us. However, the thought of using 
a physical idea, such as a non-additive entropic functional, to potentially help prove a purely 
geometric conjecture such as that of Mahler, is probably too enticing not to motivate someone 
to look carefully at, and further develop, the symplectic/asymptotic convex point of view. 

In closing, and from a formalistic viewpoint, one cannot avoid mentioning a trend toward 
categorification that exists in some mathematical quarters. Such categorification may provide 
a formalism that may be able to bring forth unexpected aspects of non-extensive statistical 
mechanics, and touches upon some aspects of topics discussed in this work. One application 
of this categorification that has touched upon Physics is that of Khovanov homology [151], [152] 
in relation to Chern-Simons theory and the Jones polynomial. It may also be worth studying 
the case of the Fukaya categories related to Lagrangian Floer cohomology in symplectic 
geometry [153] and mirror symmetry. In the context of entropy, alas only for $S_{BGS}$, and in 
the spirit of categorification, one may appreciate some unique insights and viewpoints explored 
in the recent [154], [155], which may eventually turn out to be particularly useful and illuminating. 